OK, I appreciate this.. here goes. I think I need a complete redesign.
Currently I have two tables in a database, one called kwlist. The other one called adata.
Kwlist has two fields one is a list of unique words or phrases, the other is a primary key, incrementing integer.
Adata has multiple fields, one being adataid an incrementing primary key (integer). Then 'kw' which is a plain text version of a word from the kwlist. Would it make my database more efficient if I replaced the plaintext kw in adata with it's related ID in the KWlist table?
After this we have company name (often repeating), text data (repeating), more text data (repeating), Date added, Time added, a true/false field.
Problem with company name is that they are constantly being added so i'm not sure about the link table thing.
I have no idea how to make it so repeating data isn't added, perhaps add a field called 'last scanned' and if it's a duplicated - update 'last scanned' rather than add the new field?
Problem is, the thing adding all the data is a Visual Studio back end that simply pulls the data and pushes it into a MySQL query. It cannot make intelligent decisions unless I add another mySQL query to check a situation for example..
I hope this all made sense
At this moment i'd be pretty willing to pay someone for a database design because I think it's a real skill that probably takes a long time to develop.