This study evaluates the productivity improvements achieved using a generative artificial intelligence personal assistant tool (PAT) developed by Trane Technologies. The PAT, based on OpenAI’s GPT-3.5 model, was deployed on Microsoft Azure to ensure secure access and protection of intellectual property. To assess the tool’s effect on productivity, an experiment compared the completion times and content quality of four common office tasks: writing an email, summarizing an article, creating instructions for a simple task, and preparing a presentation outline. Sixty-three (63) participants were randomly divided into a test group using the PAT and a control group performing the tasks manually. Results indicated significant productivity enhancements, particularly for summarization and instruction creation: writing email content improved by 3.3%, summarizing text by 69%, creating instructions by 45.9%, and preparing an outline by 24.8%. The study further analyzed factors such as the age of users, response word counts, and quality of responses, revealing that PAT users generated more verbose and higher-quality content. An ‘LLM-as-a-judge’ method employing GPT-4 was used to grade the quality of responses and effectively distinguished between high- and low-quality outputs. The findings underscore the potential of PATs to enhance workplace productivity and highlight areas for further research and optimization.
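The abstract reports per-task improvements as percentages but does not state the exact formula. The sketch below assumes one plausible definition: the relative reduction in mean completion time of the test group against the control group. The function name `percent_improvement` and the timing values are hypothetical and are not data from the study.

```python
from statistics import mean


def percent_improvement(control_times, test_times):
    """Relative reduction in mean completion time, as a percentage.

    Assumed definition (not confirmed by the abstract):
        100 * (mean(control) - mean(test)) / mean(control)
    """
    control_mean = mean(control_times)
    test_mean = mean(test_times)
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return 100.0 * (control_mean - test_mean) / control_mean


# Hypothetical completion times in minutes (NOT data from the study).
control = [10.0, 12.0, 8.0, 10.0]  # tasks done manually
pat = [5.0, 6.0, 4.0, 5.0]         # tasks done with the assistant

improvement = percent_improvement(control, pat)
print(f"{improvement:.1f}% faster")  # prints "50.0% faster"
```

Under this definition a positive value means the test group finished faster on average; a study could equally use medians, which would change the numbers for skewed timing data.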
Published in: American Journal of Artificial Intelligence (Volume 8, Issue 2)
DOI: 10.11648/j.ajai.20240802.16
Page(s): 68-80
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright: Copyright © The Author(s), 2024. Published by Science Publishing Group
Keywords: Generative AI, Productivity, Employee Assistant, Task Analysis, AI Integration
APA Style
Freeman, B. S., Arriola, K., Cottell, D., Lawlor, E., Erdman, M., et al. (2024). Evaluation of Task Specific Productivity Improvements Using a Generative Artificial Intelligence Personal Assistant Tool. American Journal of Artificial Intelligence, 8(2), 68-80. https://doi.org/10.11648/j.ajai.20240802.16
ACS Style
Freeman, B. S.; Arriola, K.; Cottell, D.; Lawlor, E.; Erdman, M., et al. Evaluation of Task Specific Productivity Improvements Using a Generative Artificial Intelligence Personal Assistant Tool. Am. J. Artif. Intell. 2024, 8(2), 68-80. doi: 10.11648/j.ajai.20240802.16
@article{10.11648/j.ajai.20240802.16,
  author = {Brian S. Freeman and Kendall Arriola and Dan Cottell and Emmett Lawlor and Matt Erdman and Trevor Sutherland and Brian Wells},
  title = {Evaluation of Task Specific Productivity Improvements Using a Generative Artificial Intelligence Personal Assistant Tool},
  journal = {American Journal of Artificial Intelligence},
  volume = {8},
  number = {2},
  pages = {68-80},
  doi = {10.11648/j.ajai.20240802.16},
  url = {https://doi.org/10.11648/j.ajai.20240802.16},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajai.20240802.16},
  year = {2024}
}
TY  - JOUR
T1  - Evaluation of Task Specific Productivity Improvements Using a Generative Artificial Intelligence Personal Assistant Tool
AU  - Brian S. Freeman
AU  - Kendall Arriola
AU  - Dan Cottell
AU  - Emmett Lawlor
AU  - Matt Erdman
AU  - Trevor Sutherland
AU  - Brian Wells
Y1  - 2024/12/18
PY  - 2024
N1  - https://doi.org/10.11648/j.ajai.20240802.16
DO  - 10.11648/j.ajai.20240802.16
T2  - American Journal of Artificial Intelligence
JF  - American Journal of Artificial Intelligence
JO  - American Journal of Artificial Intelligence
SP  - 68
EP  - 80
PB  - Science Publishing Group
SN  - 2639-9733
UR  - https://doi.org/10.11648/j.ajai.20240802.16
VL  - 8
IS  - 2
ER  -