An Explanation of the Remaining 35% of the English Sentences Not on List 907
These include sentences that had outright errors.
These include sentences that didn't sound like what a native speaker would say, even though the intended meaning can be understood.
These include "near duplicates". (For example, in cases where there are both "Tom swims." and "Mary swims.", usually only the sentences with "Tom" as the subject were put on List 907. -- See bit.ly/tatoebawildcards.)
These include very long items, usually multiple-sentence entries, that I felt weren't useful for my own projects, so I filtered them out without reading them.
These obviously also include sentences that I haven't yet proofread.
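The near-duplicate and length criteria above can be illustrated with a small sketch. This is not the procedure actually used to build List 907, which relied on manual proofreading. The name list (`NAMES`), the preferred name (`PREFERRED`), the word limit (`MAX_WORDS`), and the function names are all assumptions made for illustration.

```python
import re

# Hypothetical settings: the real wildcard names and length cutoff are not
# stated in this document.
NAMES = ("Tom", "Mary")
PREFERRED = "Tom"
MAX_WORDS = 30  # assumed cutoff for "very long" items

_NAME_RE = re.compile(r"\b(?:" + "|".join(map(re.escape, NAMES)) + r")\b")
_PREFERRED_RE = re.compile(re.escape(PREFERRED) + r"\b")


def template(sentence):
    """Replace each known name with '*' so near duplicates share one key."""
    return _NAME_RE.sub("*", sentence)


def starts_with_preferred(sentence):
    """True if the sentence begins with the preferred name as a whole word."""
    return _PREFERRED_RE.match(sentence) is not None


def filter_sentences(sentences):
    """Drop very long items and keep one sentence per name-wildcard template."""
    chosen = {}
    for s in sentences:
        if len(s.split()) > MAX_WORDS:
            continue  # skip very long entries without further checks
        key = template(s)
        current = chosen.get(key)
        if current is None:
            chosen[key] = s
        elif starts_with_preferred(s) and not starts_with_preferred(current):
            # A version with the preferred subject replaces the earlier one.
            chosen[key] = s
    return list(chosen.values())
```

For instance, given "Mary swims.", "Tom swims.", and "I like tea.", this sketch keeps "Tom swims." and "I like tea.". It cannot detect outright errors or unnatural phrasing; those still need a human proofreader.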
Criticism That I Have Received About This List
A member has criticized me for suggesting that members limit themselves to only translating sentences that I have proofread and selected to use on my own projects.
While I do not think there is anything wrong with directing members to good sentences, here are some likely reasons why people may not like my list.
I try to not include anything that is obscene or culturally insensitive, since I use these sentences on a website aimed at people around the world, including schoolchildren.
I try to not include too many near duplicates since my website is aimed at people who can very likely create such near duplicates themselves.
I try to not include nonsense sentences, strange sentences, factually inaccurate sentences, etc. Such decisions can be subjective.
Why I Started This List
When I first started using the Tatoeba Corpus, I noticed a lot of errors. With so many errors, I did not want to use the whole corpus for any project that was designed for students learning English.
While at first glance it might seem that all these errors could be corrected, that isn't possible.
New sentences with errors keep coming into the database.
Some "owners" of sentences are resistant to making corrections.
Some "owners" of sentences get upset when corrections are suggested, so to keep tatoeba.org a friendly place, it is simpler to not make suggestions for corrections.
Some sentences with errors have too many linked translations that would be affected by any change to the English sentence.
... and a few other reasons.
It soon became obvious that it would be a more effective use of my time to proofread and select sentences to use than to try to get every error corrected.
Recently, since the list has grown so large, I have been more careful not to add sentences that are near duplicates of sentences already on the list. Students using my projects can very likely create such sentences easily on their own.