As teachers, we found that first-year engineering students at our university (KMUTT) were not linguistically prepared for the kind of language they would be reading in their textbooks and writing in their papers. We also found that there were no suitable resources for engineering students to study with.

Materials such as the AWL (Academic Word List) exist, but they are generally geared towards high-level academic language and cover a wide range of disciplines, so much of that language would not be useful for these students.

The primary goal of the website is to provide vocabulary materials for engineering students in the EFL (English as a Foreign Language) context, based on natural data and sound corpus linguistics principles. In particular, these materials are designed to help Thai first-year students prepare for the kind of language they will encounter in their textbooks throughout their university careers.


In order to create the materials available on this website, we first find the most frequently occurring words and phrases. Next, we make sure that the words and phrases we’ve located are applicable to the majority of the engineering sub-disciplines.

For example, let’s say the word “hydrocarbon” is in the top 5% of words, so we should consider it for inclusion in the list. However, if it only occurs in texts for chemical engineers, we will not include it, because it is not broadly useful for engineers in general.
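
The two criteria described above (overall frequency, then spread across sub-disciplines) can be illustrated with a minimal sketch. This is not the procedure actually used to build the materials; the function name, the 5% cutoff borrowed from the example, and the reading of "majority" as more than half of the sub-disciplines are all assumptions made for illustration, and the toy corpus is invented.

```python
from collections import Counter


def select_vocabulary(texts_by_discipline, top_fraction=0.05, min_range=0.5):
    """Return words that are frequent overall and spread across disciplines.

    texts_by_discipline: dict mapping a discipline name to a list of word tokens.
    top_fraction: share of the distinct words, ranked by frequency, to consider
                  (hypothetical default taken from the "top 5%" example).
    min_range: a word must occur in more than this share of the disciplines
               (assumed interpretation of "the majority").
    """
    # Overall frequency of every word across the whole corpus.
    overall = Counter()
    for tokens in texts_by_discipline.values():
        overall.update(tokens)

    # Number of disciplines in which each word appears at least once.
    discipline_count = Counter()
    for tokens in texts_by_discipline.values():
        discipline_count.update(set(tokens))

    # Frequency criterion: keep the top slice of the ranked word list.
    cutoff = max(1, int(len(overall) * top_fraction))
    frequent = [word for word, _ in overall.most_common(cutoff)]

    # Range criterion: drop words confined to a minority of disciplines.
    total = len(texts_by_discipline)
    return [w for w in frequent if discipline_count[w] / total > min_range]


# Invented toy corpus: "hydrocarbon" is frequent but occurs in one discipline only.
corpus = {
    "chemical": "the hydrocarbon reacts and the hydrocarbon burns".split(),
    "civil": "the beam carries the load".split(),
    "mechanical": "the shaft carries the load".split(),
}
selected = select_vocabulary(corpus, top_fraction=0.5)
print(selected)
```

In this toy run, "hydrocarbon" passes the frequency cutoff but is dropped by the range check because it appears only in the chemical engineering text, which mirrors the example above.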

The importance of using real, natural language as the source of our data cannot be overstated. Because this is language that we know our students will encounter, we can address the words and phrases they are most likely to meet often and make sure they get a handle on that essential vocabulary.

This empirical approach allows us to take the guesswork and opinion out of the materials and focus on what students actually have to know to be able to understand their texts.


The corpus consists of over 1.15M words from 29 engineering textbooks in the following twelve sub-disciplines of engineering:

Disciplines Included
Civil Engineering
Mechanical Engineering
Computer Engineering
Chemical Engineering
Environmental Engineering
Electrical Engineering
Materials Engineering
Production Engineering
Tool Engineering
Control Systems and Instrumentation
Electronics and Telecommunication

Textbook # of Words
1. Biology 42,857
2. C++ 50,103
3. Calculus 59,326
4. Chemical engineering 46,509
5. Chemistry 45,350
6. Database 52,811
7. Data structure 35,789
8. Discrete mathematics 50,991
9. Circuits and circuit analysis 34,585
10. Engineering materials 53,426
11. Engineering programming 29,165
12. Environmental pollution 34,235
13. Environmental engineering 40,861
14. Fluid mechanics 39,138
15. Hydraulic fluids 42,174
16. Java 28,049
17. Manufacturing processes 61,837
18. Material and energy balance 21,950
19. Mechanical solids 26,501
20. Physics 88,978
21. Statics and dynamics 50,302
22. Statics 36,888
23. Structural analysis 36,826
24. Surveying 48,353
25. Technical drawing 69,228
26. Thermodynamics 54,149
27. Wastewater management 24,144
Total 1,204,525*

We selected these twelve sub-disciplines because we most often teach students from these twelve departments.

* Note that after processing to account for unusable characters, the word count is ~1.15M words.


  1. Create materials for instructors who are teaching English to engineering students.
  2. Make these materials available to researchers in order to facilitate research.
  3. Make online exercises for students for self-study.
  4. Develop a corpus based on first-year engineering textbooks.
  5. In the long term, we hope to increase the number and variety of materials. Also, we would like to invite submissions of relevant materials from others that we could include on this website.

In the future

In the near future, we will develop similar exercises and materials for science and IT teachers, students, and researchers. The corpora for science and IT are complete, but the exercises, word lists, and so on, have not been started yet.

Contact Us

We are always interested in improving our materials. Please contact us if you have any questions or suggestions, or if you have found a mistake.

We can be contacted by clicking either of our names:

Cite Us

If you would like to cite these materials, please use the following form:

Osment, C., & Graham, D. (2013, June 26). CEEM: Corpus-driven Engineering English Materials. Available from: http://crs2.kmutt.ac.th/ceem