Note: *: Equal Contribution, †: Graduate Student, ◇: Postdoctoral Fellow

Under Review/Revision

[6] Mohlmann, M., Lalor, J. P., Son, Y., & Berente, N. Inflation in reputation systems? Newcomers, veterans, and socialization into a platform context. Minor revision (after 5th round) at Information Systems Research.
[5] Zheng, S.,† Lalor, J. P., & Chen, Y. Diversifying recommendations on digital platforms: A dynamic graph neural network approach. Major revision (after 1st round) at Management Science.
[4] Li, S.,† Lalor, J. P., Ahmad, F., Abbasi, A., & Chawla, N. Modeling edge-rich graphs using neural networks. Major revision (after 1st round) at IEEE Transactions on Knowledge and Data Engineering.
[3] Lalor, J. P., Angst, C. M., D’Arcy, J., Nwanganga, F., & Joshi, M. When uniform regulation meets local realities: A theory of distributed regulatory decoupling in the case of GDPR. Under review (1st round) at Information Systems Research.
[2] Pothugunta, K.,◇ & Lalor, J. P. When AI learns to care: Cross-cultural variation in artificial compassion. Under review (1st round) at Information Systems Research Special Issue on Compassionate AI.
[1] Oketch, K.,† Lalor, J. P., Zhang, D., & Abbasi, A. Is linguistic variation signal or noise? A taxonomy-guided evaluation of sociolinguistic diversity in Swahili NLP. Under review (1st round) at MIS Quarterly.

Journal Articles

[J15] Meng, G.,† Zeng, Q.,† Lalor, J. P., & Yu, H. (2025). A psychology-based unified dynamic framework for curriculum learning. Computational Linguistics, 1–49.
[J14] Krishnan, R., Lalor, J. P., Prat, N., & Abbasi, A. (2025). From policy to practice: Research directions for trustworthy and responsible artificial intelligence “by design”. IEEE Intelligent Systems, 40(5), 45–51.
[J13] Li, W.,† Lalor, J. P., Chen, Y., & Kanuri, V. K. (2025). From stars to insights: Exploration and implementation of unified sentiment analysis with distant supervision. ACM Transactions on Management Information Systems, 16(3), 1–21.
[J12] Yang, Y., Lalor, J. P., Abbasi, A., & Zeng, D. D. (2025). Hierarchical deep document model. IEEE Transactions on Knowledge and Data Engineering, 37(1), 351–364.
[J11] Lalor, J. P., Abbasi, A., Oketch, K.,† Yang, Y., & Forsgren, N. (2024). Should fairness be a metric or a model? A model-based framework for assessing bias in machine learning pipelines. ACM Transactions on Information Systems, 42(4), 1–41.
- ACM TOIS Editors’ Pick for Notable Papers.
- Selected for presentation at ACM SIGIR 2024 (approximately 10–12% of annual TOIS publications are invited).
- Mendoza Mission Research Award, 2025.
[J10] Safadi, H., Lalor, J. P., & Berente, N. (2024). The effect of bots on human interaction in online communities. MIS Quarterly, 48(3), 1279–1296.
[J9] Lalor, J. P., Levy, D. A., Jordan, H. S., Hu, W., Smirnova, J. K., & Yu, H. (2024). Evaluating expert-layperson agreement in identifying jargon terms in electronic health record notes: Observational study. Journal of Medical Internet Research, 26, e49704.
[J8] Levy, D. A., Jordan, H. S., Lalor, J. P., Smirnova, J. K., Hu, W., Liu, W., & Yu, H. (2024). Individual factors that affect laypeople’s understanding of definitions of medical jargon. Health Policy and Technology, 13(6), 100932.
[J7] Lalor, J. P., & Rodriguez, P. (2023). py-irt: A scalable item response theory library for Python. INFORMS Journal on Computing, 35(1), 5–13.
- INFORMS ISS Design Science Award, 2025.
[J6] Wowak, K. D., Lalor, J. P., Somanchi, S., & Angst, C. M. (2023). Business analytics in healthcare: Past, present, and future trends. Manufacturing & Service Operations Management, 25(3), 975–995.
[J5] Lalor, J. P., Wu, H., Mazor, K. M., & Yu, H. (2023). Evaluating the efficacy of NoteAid on EHR note comprehension among US Veterans through Amazon Mechanical Turk. International Journal of Medical Informatics, 172, 105006.
[J4] Lalor, J. P., Hu, W., Tran, M., Wu, H., Mazor, K. M., & Yu, H. (2021). Evaluating the effectiveness of NoteAid in a community hospital setting: Randomized trial of electronic health record note comprehension interventions with patients. Journal of Medical Internet Research, 23(5), e26354.
[J3] Chen, J., Lalor, J. P., Liu, W., Druhl, E., Granillo, E., Vimalananda, V. G., & Yu, H. (2019). Detecting hypoglycemia incidents reported in patients’ secure messages: Using cost-sensitive learning and oversampling to reduce data imbalance. Journal of Medical Internet Research, 21(3), e11990.
- Also presented at the 2018 American Medical Informatics Association (AMIA) Annual Symposium.
[J2] Lalor, J. P., Woolf, B., & Yu, H. (2019). Improving electronic health record note comprehension with NoteAid: Randomized trial of electronic health record note comprehension interventions with crowdsourced workers. Journal of Medical Internet Research, 21(1), e10793.
[J1] Lalor, J. P., Wu, H., Chen, L., Mazor, K. M., & Yu, H. (2018). ComprehENotes, an instrument to assess patient reading comprehension of electronic health record notes: Development and validation. Journal of Medical Internet Research, 20(4), e139.
- Also presented at the 2017 American Medical Informatics Association (AMIA) Annual Symposium.

Computer Science Conference Proceedings

Note: This list includes papers in computer science conference proceedings that are listed as “Top Conferences” in the NYU Technology, Operations and Statistics Department Top Publication Venues.

[TC8] Lalor, J. P., Qin, R.,† Dobolyi, D., & Abbasi, A. (2025, July). Textagon: Boosting language models with theory-guided parallel representations. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations) (pp. 82–92).
[TC7] Cook, R. A.,† Lalor, J. P., & Abbasi, A. (2025, April). No simple answer to data complexity: An examination of instance-level complexity metrics for classification tasks. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers) (pp. 2553–2573).
[TC6] Lalor, J. P., Yang, Y., Smith, K., Forsgren, N., & Abbasi, A. (2022, July). Benchmarking intersectional biases in NLP. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 3598–3609).
[TC5] Abbasi, A., Dobolyi, D., Lalor, J. P., Netemeyer, R. G., Smith, K., & Yang, Y. (2021, November). Constructing a psychometric testbed for fair natural language processing. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (pp. 3748–3758).
- Authors listed alphabetically.
[TC4] Rodriguez, P., Barrow, J., Hoyle, A. M., Lalor, J. P., Jia, R., & Boyd-Graber, J. (2021, August). Evaluation examples are not equally informative: How should that change NLP leaderboards? In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (pp. 4486–4503).
[TC3] Lalor, J. P., Wu, H., & Yu, H. (2019, November). Learning latent parameters without human response patterns: Item response theory with artificial crowds. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (pp. 4249–4259).
- Also presented at the 2019 Workshop on Shortcomings in Vision and Language (SiVL).
[TC2] Lalor, J. P., Wu, H., Munkhdalai, T., & Yu, H. (2018). Understanding deep learning performance through an examination of test set difficulty: A psychometric case study. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (pp. 4711–4716).
[TC1] Lalor, J. P., Wu, H., & Yu, H. (2016, November). Building an evaluation scale using item response theory. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (pp. 648–657).

Additional Peer-Reviewed Conference and Workshop Proceedings

[P30] Pothugunta, K.,◇ & Lalor, J. P. (To appear). Carefully considering culture: Analyzing LLM alignment in single- and multi-cultural settings using cultural consensus theory. In Findings of the Association for Computational Linguistics: ACL 2026.
[P29] Meng, G.,† Gu, P., Liang, P., Lalor, J. P., Chambers, E. W., & Chen, D. Z. (To appear). TopoCL: Topological contrastive learning for medical imaging. In Proceedings of the 2026 Computer Vision and Pattern Recognition Conference (CVPR).
[P28] Chen, S.,† Lalor, J. P., Yang, Y., & Abbasi, A. (2025, July). PersonaTwin: A multi-tier prompt conditioning framework for generating and evaluating personalized digital twins. In Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²) (pp. 774–788).
[P27] Oketch, K.,† Lalor, J. P., & Abbasi, A. (2025). Cultural artifacts, tribal heterogeneity, and language models. In the 46th AIS International Conference on Information Systems (ICIS).
- Also presented at the 2025 Workshop on Information Technology and Systems (WITS).
[P26] Oketch, K.,† Lalor, J. P., Yang, Y., & Abbasi, A. (2025). Bridging the LLM accessibility divide? Performance, fairness, and cost of closed versus open LLMs for automated essay scoring. In Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²) (pp. 655–669).
[P25] Prat, N., Lalor, J. P., & Abbasi, A. (2025, May). GALEA – Leveraging generative agents in artifact evaluation. In International Conference on Design Science Research in Information Systems and Technology (pp. 83–98).
[P24] Yang, Y., Duan, H.,† Abbasi, A., Lalor, J. P., & Tam, K. Y. (2025, May). Bias a-head? Analyzing bias in transformer-based language model attention heads. In Proceedings of the 5th Workshop on Trustworthy NLP (TrustNLP 2025) (pp. 276–290).
[P23] Lalor, J. P., Somanchi, S., Nwanganga, F., D’Arcy, J., & Angst, C. M. (2024). It’s not what you say, it’s how you say it: Investigating GDPR enforcement variation in the EU. In Academy of Management Proceedings (Vol. 2024, No. 1, p. 17252).
- Also presented at the Twentieth Symposium on Statistical Challenges in Electronic Commerce Research (SCECR).
[P22] Lalor, J. P., Rodriguez, P., Sedoc, J., & Hernandez-Orallo, J. (2024, March). Item response theory for natural language processing. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: Tutorial Abstracts (pp. 9–13).
[P21] Li, W.,† Chen, Y., Zheng, S., Wang, L., & Lalor, J. P. (2024, March). Stars are all you need: A distantly supervised pyramid network for unified sentiment analysis. In Proceedings of the Ninth Workshop on Noisy and User-generated Text (W-NUT 2024) (pp. 104–118).
[P20] Duan, X.,† & Lalor, J. P. (2023). H-COAL: Human correction of AI-generated labels for biomedical named entity recognition. In Conference on Information Systems and Technology.
[P19] Lalor, J. P. (2023). Ranking pull requests in open source software. In Academy of Management Proceedings (Vol. 2023, No. 1, p. 12665).
[P18] Lalor, J. P. (2022). On-the-fly difficulty estimation for deep neural networks. In 2022 INFORMS Annual Meeting.
[P17] Rodriguez, P.,† Htut, P. M.,† Lalor, J. P., & Sedoc, J. (2022, May). Clustering examples in multi-dataset benchmarks with item response theory. In Proceedings of the Third Workshop on Insights from Negative Results in NLP (pp. 100–112).
[P16] Berente, N., Lalor, J. P., Somanchi, S., & Abbasi, A. (2021). The illusion of certainty and data-driven decision making in emergent situations. In AIS International Conference on Information Systems (ICIS).
[P15] Safadi, H., Lalor, J. P., & Berente, N. (2021). The effect of bots on human interaction in online communities. In AIS International Conference on Information Systems (ICIS).
- Best Theory Paper Award.
- Also presented at the 2020 INSNA Sunbelt Conference.
[P14] Lalor, J. P., & Guo, H. (2021). Measuring algorithmic interpretability. In 2021 INFORMS Annual Meeting.
- Also presented at the 2020 INFORMS Workshop on Data Science.
[P13] Lalor, J. P., Hu, W., Tran, M., Mazor, K. M., & Yu, H. (2021). Does defining medical jargon in a community hospital setting improve comprehension? In 2021 INFORMS Healthcare Conference.
[P12] Lalor, J. P., & Yu, H. (2020, November). Dynamic data selection for curriculum learning via ability estimation. In Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 545–555).
[P11] Ma, M. C., & Lalor, J. P. (2020, November). An empirical analysis of human-bot interaction on Reddit. In Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020) (pp. 101–106).
[P10] Cho, E., Xie, H., Lalor, J. P., Kumar, V., & Campbell, W. M. (2019, December). Efficient semi-supervised learning for natural language understanding by optimizing diversity. In 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) (pp. 1077–1084). IEEE.
[P9] Lalor, J. P., Wu, H., & Yu, H. (2019). Comparing human and DNN-ensemble response patterns for item response theory model fitting. In the 2019 Workshop on Cognitive Modeling and Computational Linguistics.
[P8] Lalor, J. P., Wu, H., & Yu, H. (2018). Modeling difficulty to understand deep learning performance. In the 2018 Northern Lights Deep Learning Workshop (NLDL).
[P7] Lalor, J. P., Wu, H., & Yu, H. (2018). Soft label memorization-generalization for natural language inference. In 2018 UAI Workshop on Uncertainty in Deep Learning.
[P6] Lalor, J. P., Wu, H., & Yu, H. (2017). CIFT: Crowd-informed fine-tuning to improve machine learning ability. In Human Computation and Crowdsourcing (HCOMP).
[P5] Munkhdalai, T., Lalor, J. P., & Yu, H. (2016, November). Citation analysis with neural attention models. In Proceedings of the Seventh International Workshop on Health Text Mining and Information Analysis (pp. 69–77).
[P4] Miller, C. S., Settle, A., & Lalor, J. P. (2015, September). Learning object-oriented programming in Python: Towards an inventory of difficulties and testing pitfalls. In Proceedings of the 16th Annual Conference on Information Technology Education (pp. 59–64).
[P3] Settle, A., Lalor, J. P., & Steinbach, T. (2015, September). Evaluating a linked-courses learning community for development majors. In Proceedings of the 16th Annual Conference on Information Technology Education (pp. 127–132).
[P2] Settle, A., Lalor, J. P., & Steinbach, T. (2015, June). A computer science linked-courses learning community. In Proceedings of the 2015 ACM Conference on Innovation and Technology in Computer Science Education (pp. 123–128).
[P1] Settle, A., Lalor, J. P., & Steinbach, T. (2015, February). Reconsidering the impact of CS1 on novice attitudes. In Proceedings of the 46th ACM Technical Symposium on Computer Science Education (pp. 229–234).

Working Papers

[7] Lalor, J. P., Guo, H., Recker, J., Berente, N., & Abbasi, A. Measuring algorithmic interpretability: A human-learning-based framework and corresponding cognitive complexity score.
[6] Costello, J., Chen, Y., Lalor, J. P., & Guo, R. Rate before you write: How the presence and positioning of multidimensional attribute ratings influence attrition in online reviews.
[5] Lalor, J. P., & Qu, X. On the production and spread of news in a digital age.
[4] Cook, R. A.,† Lalor, J. P., & Abbasi, A. CADE: Classification with automatic difficulty estimation.
[3] Lalor, J. P., Kanuri, V., & Chakraborty, I. FEWD: A fused explainable model using wide and deep networks for synthesizing multi-modal content.
[2] Lalor, J. P., & Just, R. Ranking pull requests in open source software.
[1] Lim, J. H., Kwon, S., Yao, Z., Lalor, J. P., & Yu, H. Large language model-based role-playing for personalized medical jargon extraction. https://arxiv.org/abs/2408.05555