Here is a list of about 800 stop words, built from 4 million documents (I started with this set). This list has helped us reduce model size and increase accuracy. Note that the same list may not be applicable to your application, so please review it before using it.
More interestingly, here is a list of commonly concatenated words that I found, along with their corrections.
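
As a rough sketch of how the two lists might be applied together during preprocessing, the snippet below first expands concatenated words and then drops stop words. The entries in `STOP_WORDS` and `CONCAT_FIXES`, the tokenizer pattern, and the `preprocess` name are all hypothetical placeholders chosen for illustration; they are not taken from the actual lists, which you would load in their place.

```python
import re

# Hypothetical sample entries for illustration only; load the real lists instead.
STOP_WORDS = {"the", "a", "of", "and", "is"}
CONCAT_FIXES = {"alot": "a lot", "infact": "in fact", "aswell": "as well"}

# Simple assumed tokenizer: lowercase runs of letters, digits, and apostrophes.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def preprocess(text):
    """Expand concatenated words, then remove stop words."""
    tokens = TOKEN_RE.findall(text.lower())

    # Expand first, so the parts of a corrected word can also be
    # checked against the stop-word list.
    expanded = []
    for token in tokens:
        expanded.extend(CONCAT_FIXES.get(token, token).split())

    return [token for token in expanded if token not in STOP_WORDS]


print(preprocess("Infact the model is alot smaller"))
```

Expanding before filtering is a design choice: a correction such as "alot" to "a lot" yields the stop word "a", which would survive if the stop-word pass ran first.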