Note: *: Equal Contribution, †: Graduate Student, ◇: Postdoctoral Fellow

Under Review/Revision

  • [6] Mohlmann, M., Lalor, J. P., Son, Y., & Berente, N. Inflation in reputation systems? Newcomers, veterans, and socialization into a platform context. Minor revision (after 5th round) at Information Systems Research.
  • [5] Zheng, S.,† Lalor, J. P., & Chen, Y. Diversifying recommendations on digital platforms: A dynamic graph neural network approach. Major revision (after 1st round) at Management Science.
  • [4] Li, S.,† Lalor, J. P., Ahmad, F., Abbasi, A., & Chawla, N. Modeling edge-rich graphs using neural networks. Major revision (after 1st round) at IEEE Transactions on Knowledge and Data Engineering.
  • [3] Lalor, J. P., Angst, C. M., D’Arcy, J., Nwanganga, F., & Joshi, M. When uniform regulation meets local realities: A theory of distributed regulatory decoupling in the case of GDPR. Under review (1st round) at Information Systems Research.
  • [2] Pothugunta, K.,◇ & Lalor, J. P. When AI learns to care: Cross-cultural variation in artificial compassion. Under review (1st round) at Information Systems Research Special Issue on Compassionate AI.
  • [1] Oketch, K.,† Lalor, J. P., Zhang, D., & Abbasi, A. Is linguistic variation signal or noise? A taxonomy-guided evaluation of sociolinguistic diversity in Swahili NLP. Under review (1st round) at MIS Quarterly.

Journal Articles

  • [J15] Meng, G.,† Zeng, Q.,† Lalor, J. P., & Yu, H. (2025). A psychology-based unified dynamic framework for curriculum learning. Computational Linguistics, 1–49.
  • [J14] Krishnan, R., Lalor, J. P., Prat, N., & Abbasi, A. (2025). From policy to practice: Research directions for trustworthy and responsible artificial intelligence “by design”. IEEE Intelligent Systems, 40(5), 45–51.
  • [J13] Li, W.,† Lalor, J. P., Chen, Y., & Kanuri, V. K. (2025). From stars to insights: Exploration and implementation of unified sentiment analysis with distant supervision. ACM Transactions on Management Information Systems, 16(3), 1–21.
  • [J12] Yang, Y., Lalor, J. P., Abbasi, A., & Zeng, D. D. (2025). Hierarchical deep document model. IEEE Transactions on Knowledge and Data Engineering, 37(1), 351–364.
  • [J11] Lalor, J. P., Abbasi, A., Oketch, K.,† Yang, Y., & Forsgren, N. (2024). Should fairness be a metric or a model? A model-based framework for assessing bias in machine learning pipelines. ACM Transactions on Information Systems, 42(4), 1–41.
    • ACM TOIS Editors’ Pick for Notable Papers.
    • Selected for presentation at ACM SIGIR 2024 (approximately 10–12% of annual TOIS publications are invited).
    • Mendoza Mission Research Award, 2025.
  • [J10] Safadi, H., Lalor, J. P., & Berente, N. (2024). The effect of bots on human interaction in online communities. MIS Quarterly, 48(3), 1279–1296.
  • [J9] Lalor, J. P., Levy, D. A., Jordan, H. S., Hu, W., Smirnova, J. K., & Yu, H. (2024). Evaluating expert-layperson agreement in identifying jargon terms in electronic health record notes: Observational study. Journal of Medical Internet Research, 26, e49704.
  • [J8] Levy, D. A., Jordan, H. S., Lalor, J. P., Smirnova, J. K., Hu, W., Liu, W., & Yu, H. (2024). Individual factors that affect laypeople’s understanding of definitions of medical jargon. Health Policy and Technology, 13(6), 100932.
  • [J7] Lalor, J. P., & Rodriguez, P. (2023). py-irt: A scalable item response theory library for Python. INFORMS Journal on Computing, 35(1), 5–13.
    • INFORMS ISS Design Science Award, 2025.
  • [J6] Wowak, K. D., Lalor, J. P., Somanchi, S., & Angst, C. M. (2023). Business analytics in healthcare: Past, present, and future trends. Manufacturing & Service Operations Management, 25(3), 975–995.
  • [J5] Lalor, J. P., Wu, H., Mazor, K. M., & Yu, H. (2023). Evaluating the efficacy of NoteAid on EHR note comprehension among US Veterans through Amazon Mechanical Turk. International Journal of Medical Informatics, 172, 105006.
  • [J4] Lalor, J. P., Hu, W., Tran, M., Wu, H., Mazor, K. M., & Yu, H. (2021). Evaluating the effectiveness of NoteAid in a community hospital setting: Randomized trial of electronic health record note comprehension interventions with patients. Journal of Medical Internet Research, 23(5), e26354.
  • [J3] Chen, J., Lalor, J. P., Liu, W., Druhl, E., Granillo, E., Vimalananda, V. G., & Yu, H. (2019). Detecting hypoglycemia incidents reported in patients’ secure messages: Using cost-sensitive learning and oversampling to reduce data imbalance. Journal of Medical Internet Research, 21(3), e11990.
    • Also presented at the 2018 American Medical Informatics Association (AMIA) Annual Symposium.
  • [J2] Lalor, J. P., Woolf, B., & Yu, H. (2019). Improving electronic health record note comprehension with NoteAid: Randomized trial of electronic health record note comprehension interventions with crowdsourced workers. Journal of Medical Internet Research, 21(1), e10793.
  • [J1] Lalor, J. P., Wu, H., Chen, L., Mazor, K. M., & Yu, H. (2018). ComprehENotes, an instrument to assess patient reading comprehension of electronic health record notes: Development and validation. Journal of Medical Internet Research, 20(4), e139.
    • Also presented at the 2017 American Medical Informatics Association (AMIA) Annual Symposium.

Computer Science Conference Proceedings

Note: This list includes papers in computer science conference proceedings that are listed as “Top Conferences” in the NYU Technology, Operations and Statistics Department's Top Publication Venues.

  • [TC8] Lalor, J. P., Qin, R.,† Dobolyi, D., & Abbasi, A. (2025, July). Textagon: Boosting language models with theory-guided parallel representations. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations) (pp. 82–92).
  • [TC7] Cook, R. A.,† Lalor, J. P., & Abbasi, A. (2025, April). No simple answer to data complexity: An examination of instance-level complexity metrics for classification tasks. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers) (pp. 2553–2573).
  • [TC6] Lalor, J. P., Yang, Y., Smith, K., Forsgren, N., & Abbasi, A. (2022, July). Benchmarking intersectional biases in NLP. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 3598–3609).
  • [TC5] Abbasi, A., Dobolyi, D., Lalor, J. P., Netemeyer, R. G., Smith, K., & Yang, Y. (2021, November). Constructing a psychometric testbed for fair natural language processing. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (pp. 3748–3758).
    • Authors listed alphabetically.
  • [TC4] Rodriguez, P., Barrow, J., Hoyle, A. M., Lalor, J. P., Jia, R., & Boyd-Graber, J. (2021, August). Evaluation examples are not equally informative: How should that change NLP leaderboards? In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (pp. 4486–4503).
  • [TC3] Lalor, J. P., Wu, H., & Yu, H. (2019, November). Learning latent parameters without human response patterns: Item response theory with artificial crowds. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (pp. 4249–4259).
    • Also presented at the 2019 Workshop on Shortcomings in Vision and Language (SiVL).
  • [TC2] Lalor, J. P., Wu, H., Munkhdalai, T., & Yu, H. (2018). Understanding deep learning performance through an examination of test set difficulty: A psychometric case study. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (pp. 4711–4716).
  • [TC1] Lalor, J. P., Wu, H., & Yu, H. (2016, November). Building an evaluation scale using item response theory. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (pp. 648–657).

Additional Peer-Reviewed Conference and Workshop Proceedings

  • [P30] Pothugunta, K.,◇ & Lalor, J. P. (To appear). Carefully considering culture: Analyzing LLM alignment in single- and multi-cultural settings using cultural consensus theory. In Findings of the Association for Computational Linguistics: ACL 2026.
  • [P29] Meng, G.,† Gu, P., Liang, P., Lalor, J. P., Chambers, E. W., & Chen, D. Z. (To appear). TopoCL: Topological contrastive learning for medical imaging. In Proceedings of the 2026 Computer Vision and Pattern Recognition Conference (CVPR).
  • [P28] Chen, S.,† Lalor, J. P., Yang, Y., & Abbasi, A. (2025, July). PersonaTwin: A multi-tier prompt conditioning framework for generating and evaluating personalized digital twins. In Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²) (pp. 774–788).
  • [P27] Oketch, K.,† Lalor, J. P., & Abbasi, A. (2025). Cultural artifacts, tribal heterogeneity, and language models. In the 46th AIS International Conference on Information Systems (ICIS).
    • Also presented at the 2025 Workshop on Information Technology and Systems (WITS).
  • [P26] Oketch, K.,† Lalor, J. P., Yang, Y., & Abbasi, A. (2025). Bridging the LLM accessibility divide? Performance, fairness, and cost of closed versus open LLMs for automated essay scoring. In Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²) (pp. 655–669).
  • [P25] Prat, N., Lalor, J. P., & Abbasi, A. (2025, May). GALEA – Leveraging generative agents in artifact evaluation. In International Conference on Design Science Research in Information Systems and Technology (pp. 83–98).
  • [P24] Yang, Y., Duan, H.,† Abbasi, A., Lalor, J. P., & Tam, K. Y. (2025, May). Bias a-head? Analyzing bias in transformer-based language model attention heads. In Proceedings of the 5th Workshop on Trustworthy NLP (TrustNLP 2025) (pp. 276–290).
  • [P23] Lalor, J. P., Somanchi, S., Nwanganga, F., D’Arcy, J., & Angst, C. M. (2024). It’s not what you say, it’s how you say it: Investigating GDPR enforcement variation in the EU. In Academy of Management Proceedings (Vol. 2024, No. 1, p. 17252).
    • Also presented at the Twentieth Symposium on Statistical Challenges in Electronic Commerce Research (SCECR).
  • [P22] Lalor, J. P., Rodriguez, P., Sedoc, J., & Hernandez-Orallo, J. (2024, March). Item response theory for natural language processing. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: Tutorial Abstracts (pp. 9–13).
  • [P21] Li, W.,† Chen, Y., Zheng, S., Wang, L., & Lalor, J. P. (2024, March). Stars are all you need: A distantly supervised pyramid network for unified sentiment analysis. In Proceedings of the Ninth Workshop on Noisy and User-generated Text (W-NUT 2024) (pp. 104–118).
  • [P20] Duan, X.,† & Lalor, J. P. (2023). H-COAL: Human correction of AI-generated labels for biomedical named entity recognition. In Conference on Information Systems and Technology.
  • [P19] Lalor, J. P. (2023). Ranking pull requests in open source software. In Academy of Management Proceedings (Vol. 2023, No. 1, p. 12665).
  • [P18] Lalor, J. P. (2022). On-the-fly difficulty estimation for deep neural networks. In 2022 INFORMS Annual Meeting.
  • [P17] Rodriguez, P.,† Htut, P. M.,† Lalor, J. P., & Sedoc, J. (2022, May). Clustering examples in multi-dataset benchmarks with item response theory. In Proceedings of the Third Workshop on Insights from Negative Results in NLP (pp. 100–112).
  • [P16] Berente, N., Lalor, J. P., Somanchi, S., & Abbasi, A. (2021). The illusion of certainty and data-driven decision making in emergent situations. In AIS International Conference on Information Systems (ICIS).
  • [P15] Safadi, H., Lalor, J. P., & Berente, N. (2021). The effect of bots on human interaction in online communities. In AIS International Conference on Information Systems (ICIS).
    • Best Theory Paper Award.
    • Also presented at the 2020 INSNA Sunbelt Conference.
  • [P14] Lalor, J. P., & Guo, H. (2021). Measuring algorithmic interpretability. In 2021 INFORMS Annual Meeting.
    • Also presented at the 2020 INFORMS Workshop on Data Science.
  • [P13] Lalor, J. P., Hu, W., Tran, M., Mazor, K., & Yu, H. (2021). Does defining medical jargon in a community hospital setting improve comprehension? In 2021 INFORMS Healthcare Conference.
  • [P12] Lalor, J. P., & Yu, H. (2020, November). Dynamic data selection for curriculum learning via ability estimation. In Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 545–555).
  • [P11] Ma, M. C., & Lalor, J. P. (2020, November). An empirical analysis of human-bot interaction on Reddit. In Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020) (pp. 101–106).
  • [P10] Cho, E., Xie, H., Lalor, J. P., Kumar, V., & Campbell, W. M. (2019, December). Efficient semi-supervised learning for natural language understanding by optimizing diversity. In 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) (pp. 1077–1084). IEEE.
  • [P9] Lalor, J. P., Wu, H., & Yu, H. (2019). Comparing human and DNN-ensemble response patterns for item response theory model fitting. In the 2019 Workshop on Cognitive Modeling and Computational Linguistics.
  • [P8] Lalor, J. P., Wu, H., & Yu, H. (2018). Modeling difficulty to understand deep learning performance. In the 2018 Northern Lights Deep Learning Workshop (NLDL).
  • [P7] Lalor, J. P., Wu, H., & Yu, H. (2018). Soft label memorization-generalization for natural language inference. In 2018 UAI Workshop on Uncertainty in Deep Learning.
  • [P6] Lalor, J. P., Wu, H., & Yu, H. (2017). CIFT: Crowd-informed fine-tuning to improve machine learning ability. In Human Computation and Crowdsourcing (HCOMP).
  • [P5] Munkhdalai, T., Lalor, J. P., & Yu, H. (2016, November). Citation analysis with neural attention models. In Proceedings of the Seventh International Workshop on Health Text Mining and Information Analysis (pp. 69–77).
  • [P4] Miller, C. S., Settle, A., & Lalor, J. P. (2015, September). Learning object-oriented programming in Python: Towards an inventory of difficulties and testing pitfalls. In Proceedings of the 16th Annual Conference on Information Technology Education (pp. 59–64).
  • [P3] Settle, A., Lalor, J. P., & Steinbach, T. (2015, September). Evaluating a linked-courses learning community for development majors. In Proceedings of the 16th Annual Conference on Information Technology Education (pp. 127–132).
  • [P2] Settle, A., Lalor, J. P., & Steinbach, T. (2015, June). A computer science linked-courses learning community. In Proceedings of the 2015 ACM Conference on Innovation and Technology in Computer Science Education (pp. 123–128).
  • [P1] Settle, A., Lalor, J. P., & Steinbach, T. (2015, February). Reconsidering the impact of CS1 on novice attitudes. In Proceedings of the 46th ACM Technical Symposium on Computer Science Education (pp. 229–234).

Working Papers

  • [7] Lalor, J. P., Guo, H., Recker, J., Berente, N., & Abbasi, A. Measuring algorithmic interpretability: A human-learning-based framework and corresponding cognitive complexity score.
  • [6] Costello, J., Chen, Y., Lalor, J. P., & Guo, R. Rate before you write: How the presence and positioning of multidimensional attribute ratings influence attrition in online reviews.
  • [5] Lalor, J. P., & Qu, X. On the production and spread of news in a digital age.
  • [4] Cook, R.,† Lalor, J. P., & Abbasi, A. CADE: Classification with automatic difficulty estimation.
  • [3] Lalor, J. P., Kanuri, V., & Chakraborty, I. FEWD: A fused explainable model using wide and deep networks for synthesizing multi-modal content.
  • [2] Lalor, J. P., & Just, R. Ranking pull requests in open source software.
  • [1] Lim, J. H., Kwon, S., Yao, Z., Lalor, J. P., & Yu, H. Large language model-based role-playing for personalized medical jargon extraction. https://arxiv.org/abs/2408.05555