Recently, the Plasma and Materials Processing (PMP) group at the TU/e hosted 'de AI workshop', a company founded by four students that aims to educate and train people in using Artificial Intelligence (AI) for their work, in particular ChatGPT 4. The workshop was attended by several members of the PMP group, both staff and students. The PMP group is one of the larger research groups within the Department of Applied Physics and Science Education. The focus of the group is on advancing the science and technology of plasma and materials processing for applications in future chips and sustainable energy technologies.

Since the launch of ChatGPT, AI has received significant attention and interest from both the general public and members of the research group. Within PMP, we are curious about the potential role of AI in the future of Atomic Scale Processing, and we hope to contribute to the development of AI applications in our field. For example, we are thinking about the use of AI in our (in situ) experimental studies of (plasma-based) atomic layer deposition (ALD), atomic layer etching (ALE), and area-selective deposition (ASD). Furthermore, we are looking at how we could make the AtomicLimits ALD and ALE databases 'AI-ready.' We will elaborate on this in a future blog post, but with this aim in the back of our minds, we already planned a workshop that looked at ChatGPT 4's capability to extract different ALD parameters, such as the growth per cycle, temperature, and film thickness, from research papers. In this blog post, we share our findings and give some hints on how to do it yourself for your work in the lab or in the fab.
During the workshop, participants learned about prompt engineering, an important aspect of engaging successfully with ChatGPT 4. These prompts must provide clear context and define the desired output accurately, which is crucial for extracting specific information from research papers that contain a lot of data and other information.
An example used during the workshop was letting ChatGPT plan your ideal holiday. When creating a prompt that gives you the desired outcome, it is important to give context. The following prompt will not give you a good plan for your ideal holiday:
“Hey, I would like to go on holiday. Please make a plan for me!”
The key to a good prompt is context. When you provide more context, ChatGPT will give you better answers. So, in the case of the ideal holiday, the following prompt will yield a more desirable result (for a holiday, what is seen as desirable depends on the person, but keep in mind a 22-year-old student wrote this article):
“Hi, I want to go on vacation to a beautiful, warm country in Southern Europe in September. I want to go for a week, and I love food, drinks, and partying. Could you make an itinerary and choose a location?”
If you compare results from the two prompts mentioned above in ChatGPT, you will find that the prompt with more context yields a more desirable result.
Giving context is not only critical when letting ChatGPT plan your ideal holiday; it is also important when analyzing research papers. ChatGPT 4 has a function that allows users to upload their papers. During the workshop, several prompts were tested to extract the parameters from the research papers. One prompt that worked relatively well is:
The prompt above was used to examine various papers. In this blog post, we used the same prompt on seven papers to compare the results and see how consistently ChatGPT 4 extracts parameters from papers. We chose some key, highly cited papers from well-known researchers within the ALD community. The papers that we analyzed using ChatGPT 4 are:
- M. D. Groner et al. (2003), Low-Temperature Al2O3 Atomic Layer Deposition, Chemistry of Materials, University of Colorado, Citations: 1190
- T. Aaltonen et al. (2003), Atomic Layer Deposition of Platinum Thin Films, Chemistry of Materials, University of Helsinki, Citations: 346
- D. M. Hausmann et al. (2002), Atomic Layer Deposition of Hafnium and Zirconium Oxides Using Metal Amide Precursors, Chemistry of Materials, Harvard University, Citations: 450
- K. Kukli et al. (2002), Atomic Layer Deposition of Hafnium Dioxide Films from Hafnium Tetrakis(ethylmethylamide) and Water, University of Helsinki, Citations: 165
- J. W. Elam et al. (2003), Surface chemistry and film growth during TiN atomic layer deposition using TDMAT and NH3, University of Colorado, Citations: 172
- M. N. Mullings et al. (2013), Tin oxide atomic layer deposition from tetrakis(dimethylamino)tin and water, Journal of Vacuum Science & Technology A, Stanford University, Citations: 87
- Aarik et al. (2000), Atomic layer deposition of titanium dioxide from TiCl4 and H2O: investigation of growth mechanism, University of Tartu, Citations: 202
The papers together with the carefully constructed prompt gave the results shown in the table below.
| # | Material | Precursors and/or Molecules Used | Typical Film Thickness | Growth Rate or Growth per Cycle | Range of Substrate/Deposition Temperatures |
|---|----------|----------------------------------|------------------------|---------------------------------|---------------------------------------------|
| 1 | Al2O3 | H2O and Al(CH3)3 (TMA) | 332 Å to 401 Å | 1.11 Å/cycle to 1.34 Å/cycle | 33 °C to 177 °C |
| 2 | Platinum | (Methylcyclopentadienyl)trimethylplatinum (MeCpPtMe3) and oxygen | Not specified | 0.45 Å/cycle | 300 °C |
| 3 | HfO2 and ZrO2 | Various metal alkyl amides and water | Not specified | ZrO2: 0.096 nm/cycle, HfO2: 0.093 nm/cycle | 50 °C to 500 °C |
| 4 | HfO2 | Hafnium tetrakis(ethylmethylamide) and water | 101 nm to 220 nm | 0.09 nm/cycle to 0.15 nm/cycle | 150 °C to 325 °C |
| 5 | TiN | Tetrakis(dimethylamino)titanium (TDMAT) and NH3 | Not specified | Increases with temperature, non-self-limiting at higher temperatures | 60 °C to 240 °C |
| 6 | SnO2 | Tetrakis(dimethylamino)tin (TDMASn) and water | Not specified | 0.70 Å/cycle at 150 °C to 2.0 Å/cycle at 30 °C | 30 °C to 200 °C |
| 7 | TiO2 | TiCl4 and H2O | Not specified | 0.078 to 0.050 nm/cycle | 100 °C to 350 °C |
Besides using AI, we also had a student extract the same parameters manually, to compare with the values extracted by ChatGPT 4. The values that differed are indicated in red in the table. When we compare the values extracted by ChatGPT 4 with those we extracted ourselves, we find that ChatGPT 4 could not find reported values for the typical film thickness for most papers. Furthermore, ChatGPT 4 failed to report the correct values for the substrate temperature for Aarik et al. The abstract of that paper mentions that the substrate temperature varies between 100 °C and 400 °C, while ChatGPT 4 reports between 100 °C and 350 °C. Although the 100-350 °C range is mentioned in the paper, the 400 °C upper value is missing.
Using the crafted prompt, we learn that ChatGPT 4 seems pretty good at extracting ALD parameters from research papers. However, some hiccups remain, such as extracting values for the film thickness. The problem with these values could be that they are most often reported in graphs, where the film thickness is plotted against the number of ALD cycles. It seems that ChatGPT 4 does not analyze the graphs in the submitted papers when using the prompt mentioned above.
Could we solve this issue by submitting a screenshot of these graphs and letting ChatGPT 4 analyze them separately? We tried this during the workshop by uploading the following image from the work of T. Aaltonen et al. together with a prompt in ChatGPT 4:
ChatGPT 4 gave the following answer:
It appears that the large language model is not able to extract the parameters from an uploaded screenshot, although it does recognize that a screenshot of a graph has been uploaded. A potential solution could be to create a custom Generative Pre-Trained Transformer (GPT) that is more focused on, and trained for, this problem: a GPT trained specifically on extracting parameters from graphs might be able to extract the data correctly.
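A related route that we did not try during the workshop is to bypass the ChatGPT interface and send the graph screenshot to a vision-capable model through the API. The sketch below is only an illustration of what that could look like; the model name, the prompt wording, and the file name are our own assumptions, not a tested recipe.

```python
# Sketch: ask a vision-capable model to read the growth per cycle off a graph screenshot.
# Assumptions: the `openai` package is installed, OPENAI_API_KEY is set, and a
# vision-capable model such as "gpt-4o" is available; prompt and file name are illustrative.
import base64

from openai import OpenAI


def describe_graph(image_path: str) -> str:
    # Encode the screenshot so it can be passed inline to the chat API.
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed vision-capable model
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": (
                            "This is a film thickness versus number of ALD cycles plot. "
                            "Estimate the growth per cycle from the slope and report it in Å/cycle."
                        ),
                    },
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                    },
                ],
            }
        ],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    print(describe_graph("aaltonen2003_figure.png"))  # hypothetical screenshot file name
```

Whether such a call reads the data points reliably enough for database work is exactly the kind of question a more focused, custom-trained GPT would have to answer.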
Towards an AI-ready ALD/ALE database
To make the ALD/ALE database AI-ready with respect to existing publications, we must find a good way to extract parameters from papers. ChatGPT 4 seems promising in that regard; however, as can be seen from the results, it does not extract all of the parameters and it sometimes makes mistakes. This means that when using ChatGPT 4, the person extracting the data must go through every paper to verify that the extracted values are correct, which is a time-consuming process. A potential way to make the database AI-ready is to ask future contributors to submit not only their papers but also the underlying measured data, in files that can easily be read out to build large datasets. The remaining questions are: how would we construct such a database, and what do we do with the data from past contributors? Are we going to extract that data ourselves, or reach out to the original authors?
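What such easily readable files could look like is still open. As one illustration (not a settled schema for the AtomicLimits databases), a contributor could supply a small machine-readable record per process alongside the paper. The field names and units below are our own assumptions; the values are simply those from row 1 of the table above.

```python
# Illustration only: a possible machine-readable record submitted alongside a paper.
# Field names and units are assumptions, not an agreed-upon schema; values are
# taken from row 1 of the table above (Groner et al., Al2O3 ALD).
import json

ald_record = {
    "material": "Al2O3",
    "precursor": "Al(CH3)3 (TMA)",
    "co_reactant": "H2O",
    "growth_per_cycle_angstrom": [1.11, 1.34],    # reported range
    "substrate_temperature_celsius": [33, 177],   # reported range
    "film_thickness_angstrom": [332, 401],        # reported range
    "reference_doi": "10.xxxx/xxxxx",             # placeholder, fill in the real DOI
}

with open("ald_record.json", "w") as handle:
    json.dump(ald_record, handle, indent=2)
```

A collection of such records could be loaded directly into a large dataset, without the extraction and verification steps described above.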
To conclude, we found it very exciting to see the potential of ChatGPT 4 for extracting different parameters from research papers. Although it did not extract every parameter from the papers correctly, using the large language model still offers a clear advantage. Analyzing graphs did not work yet, but using ChatGPT 4 to extract ALD parameters from research papers is very promising, especially given the tremendous progress in AI that is anticipated. We want to encourage you, the reader, to try it yourself!
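For readers who want to try this outside the ChatGPT interface, for example to process many papers in one go, the sketch below shows one possible starting point. It is a minimal sketch and not the workflow used in the workshop: it assumes the openai and pypdf Python packages, an assumed model name ("gpt-4o"), and an illustrative prompt that is not the exact prompt we used.

```python
# Minimal sketch: extract ALD parameters from a paper's PDF text with an LLM.
# Assumptions: the `openai` and `pypdf` packages are installed, OPENAI_API_KEY is
# set, and "gpt-4o" is available. The prompt below is illustrative only and is
# not the prompt used in the workshop.
import json

from openai import OpenAI
from pypdf import PdfReader

EXAMPLE_PROMPT = (
    "You are given the text of an atomic layer deposition (ALD) paper. "
    "Extract the deposited material, the precursors and co-reactants, the typical "
    "film thickness, the growth per cycle, and the range of substrate temperatures. "
    "Reply with a JSON object using those five keys; use \"not specified\" when a "
    "value is not reported in the text."
)


def extract_ald_parameters(pdf_path: str) -> dict:
    # Pull the plain text out of the PDF. Figures and their data are lost here,
    # which matches the film-thickness problem discussed above.
    reader = PdfReader(pdf_path)
    paper_text = "\n".join(page.extract_text() or "" for page in reader.pages)

    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model name; adjust to what you have access to
        messages=[
            {"role": "system", "content": EXAMPLE_PROMPT},
            {"role": "user", "content": paper_text},
        ],
        response_format={"type": "json_object"},  # ask for machine-readable output
    )
    return json.loads(response.choices[0].message.content)


if __name__ == "__main__":
    print(extract_ald_parameters("ald_paper.pdf"))  # hypothetical file name
```

Note that only the plain text of the PDF is sent, so anything reported exclusively in a figure will still be missed, and the extracted values should still be checked against the paper, just as we did by hand above.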