Datasets and Benchmarks Track Papers

DOI links will be available by August 3rd; please check back then to access the direct links below.

BatteryLife: A Comprehensive Dataset and Benchmark for Battery Life Prediction
DOI: 10.1145/3711896.3737372

Ruifeng Tan (The Hong Kong University of Science and Technology (Guangzhou),The Hong Kong University of Science and Technology); Weixiang Hong (The Hong Kong University of Science and Technology (Guangzhou),The Hong Kong University of Science and Technology); Jiayue Tang (The Hong Kong University of Science and Technology (Guangzhou),The Hong Kong University of Science and Technology); Xibin Lu (The Hong Kong University of Science and Technology (Guangzhou),The Hong Kong University of Science and Technology); Ruijun Ma (CALB Group Co., Ltd.); Xiang Zheng (CALB Group Co., Ltd.); Jia Li (The Hong Kong University of Science and Technology (Guangzhou)); Jiaqiang Huang (The Hong Kong University of Science and Technology (Guangzhou)); Tong-Yi Zhang (The Hong Kong University of Science and Technology (Guangzhou))

Bridging the Binary Analysis Gap: A Cross-Compiler Dataset and Neural Framework for Industrial Control Systems
DOI: 10.1145/3711896.3737373

Yonatan G. Achamyeleh (University of California, Irvine); Shih-Yuan Yu (University of California, Irvine); Gustavo Q. Araya (Siemens Corporate Research); Mohammad A. Al Faruque (University of California, Irvine)

ChineseEcomQA: A Scalable E-commerce Concept Evaluation Benchmark for Large Language Models
DOI: 10.1145/3711896.3737374

Haibin Chen (Taobao & Tmall Group of Alibaba); Kangtao Lv (Zhejiang University); Chengwei Hu (Taobao & Tmall Group of Alibaba); Yanshi Li (Taobao & Tmall Group of Alibaba); Yujin Yuan (Taobao & Tmall Group of Alibaba); Yancheng He (Taobao & Tmall Group of Alibaba); Xingyao Zhang (Taobao & Tmall Group of Alibaba); Langming Liu (Taobao & Tmall Group of Alibaba); Shilei Liu (Taobao & Tmall Group of Alibaba); Wenbo Su (Taobao & Tmall Group of Alibaba); Bo Zheng (Taobao & Tmall Group of Alibaba)

CityBench: Evaluating the Capabilities of Large Language Models for Urban Tasks
DOI: 10.1145/3711896.3737375

Jie Feng (Department of Electronic Engineering, BNRist, Tsinghua University); Jun Zhang (Department of Electronic Engineering, BNRist, Tsinghua University); Tianhui Liu (School of Electronic and Information Engineering, Beijing Jiaotong University); Xin Zhang (Shenzhen International Graduate School, Tsinghua University); Tianjian Ouyang (Department of Electronic Engineering, BNRist, Tsinghua University); Junbo Yan (Department of Electronic Engineering, BNRist, Tsinghua University); Yuwei Du (Department of Electronic Engineering, BNRist, Tsinghua University); Siqi Guo (Department of Electronic Engineering, Tsinghua University); Yong Li (Department of Electronic Engineering, BNRist, Tsinghua University)

Delving into Instance-Dependent Label Noise in Graph Data: A Comprehensive Study and Benchmark
DOI: 10.1145/3711896.3737376

Suyeon Kim (Pohang University of Science and Technology); SeongKu Kang (Korea University); Dongwoo Kim (Pohang University of Science and Technology); Jungseul Ok (Pohang University of Science and Technology); Hwanjo Yu (Pohang University of Science and Technology)

HiBench: Benchmarking LLMs Capability on Hierarchical Structure Reasoning
DOI: 10.1145/3711896.3737378

Zhuohang Jiang (Department of Computing, The Hong Kong Polytechnic University,School of Computer Science, Sichuan University); Pangjing Wu (Department of Computing, The Hong Kong Polytechnic University,College of Computer and Information, Hohai University); Ziran Liang (Department of Computing, The Hong Kong Polytechnic University,Sun Yat-sen University); Peter Q. Chen (Department of Computing, The Hong Kong Polytechnic University); Xu Yuan (Department of Computing, The Hong Kong Polytechnic University,Harbin Institute of Technology); Ye Jia (Department of Computing, The Hong Kong Polytechnic University); Tu Jiancheng (Department of Computing, The Hong Kong Polytechnic University); Chen Li (Department of Computing, The Hong Kong Polytechnic University,Department of Applied Social Sciences, The Hong Kong Polytechnic University); Peter H. F. Ng (Department of Computing, The Hong Kong Polytechnic University); Qing Li (Department of Computing, The Hong Kong Polytechnic University,Department of Computer Science, City University of Hong Kong)

HtFLlib: A Comprehensive Heterogeneous Federated Learning Library and Benchmark
DOI: 10.1145/3711896.3737379

Jianqing Zhang (Shanghai Jiao Tong University,Institute for AI Industry Research, Tsinghua University); Xinghao Wu (Beijing University); Yanbing Zhou (Chongqing University); Xiaoting Sun (Tongji University); Qiqi Cai (Shanghai Jiao Tong University); Yang Liu (Hong Kong Polytechnic University,Shanghai Artificial Intelligence Laboratory); Yang Hua (The Queen’s University of Belfast); Zhenzhe Zheng (Shanghai Jiao Tong University); Jian Cao (Shanghai Jiao Tong University,Shanghai Key Laboratory of Trusted Data Circulation and Governance in Web3); Qiang Yang (Hong Kong Polytechnic University)

IrrMap: A Large-Scale Comprehensive Dataset for Irrigation Method Mapping
DOI: 10.1145/3711896.3737380

Nibir Chandra Mandal (Dept. of Computer Science, University of Virginia); Oishee Bintey Hoque (Dept. of Computer Science, University of Virginia); Abhijin Adiga (Biocomplexity Institute, University of Virginia); Samarth Swarup (Biocomplexity Institute, University of Virginia); Mandy L. Wilson (Biocomplexity Institute, University of Virginia); Lu Feng (Dept. of Computer Science, University of Virginia); Yangfeng Ji (Dept. of Computer Science, University of Virginia); Miaomiao Zhang (Dept. of Computer Science, Dept. of Electrical and Computer Engineering, University of Virginia); Geoffrey Fox (Dept. of Computer Science, Biocomplexity Institute, University of Virginia); Madhav Marathe (Dept. of Computer Science, Biocomplexity Institute, University of Virginia)

IVMR suite: An Industrial-scale Virtual Machine Rescheduling Dataset and Benchmark for Elastic Cloud Service
DOI: 10.1145/3711896.3737381

Yupeng Zhang (Alibaba DAMO Academy, Alibaba Group); Xu Wan (Zhejiang University,DAMO Academy, Alibaba Group); Xiangyun Kong (Alibaba Cloud Intelligence Group, Alibaba Group); Chao Yang (Alibaba DAMO Academy, Alibaba Group); Binda Ma (Alibaba Cloud Intelligence Group, Alibaba Group); Wotao Yin (DAMO Academy, Alibaba Group); Jian Zhou (Alibaba Cloud Intelligence Group, Alibaba Group)

ComputAgeBench: Epigenetic Aging Clocks Benchmark
DOI: 10.1145/3711896.3737382

Dmitrii Kriukov (Artificial Intelligence Research Institute,Skolkovo Institute of Science and Technology); Evgeniy Efimov (Artificial Intelligence Research Institute,Skolkovo Institute of Science and Technology); Ekaterina Kuzmina (Artificial Intelligence Research Institute,Skolkovo Institute of Science and Technology); Anastasiia Dudkovskaia (Skolkovo Institute of Science and Technology,Higher School of Economics); Ekaterina E. Khrameeva (Skolkovo Institute of Science and Technology); Dmitry V. Dylov (Artificial Intelligence Research Institute,Skolkovo Institute of Science and Technology)

Evaluating and Generating Query Workloads for High Dimensional Vector Similarity Search
DOI: 10.1145/3711896.3737383

Matteo Ceccarello (University of Padua); Alexandra Levchenko (Isep, LISITE); Ioana Ileana (Université Paris Cité); Themis Palpanas (Université Paris Cité)

FoodPuzzle: Toward Developing Large Language Models as Autonomous Flavor Scientists
DOI: 10.1145/3711896.3737384

Dong Hee Lee; Emily Steliotes; Jiatong Shi; John Sweeney; Jonathan May; Matthew Lange; Muhao Chen; Tenghao Huang

UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering
DOI: 10.1145/3711896.3737385

Langming Liu (Taobao & Tmall Group of Alibaba); Shilei Liu (Taobao & Tmall Group of Alibaba); Yujin Yuan (Taobao & Tmall Group of Alibaba); Yizhen Zhang (Taobao & Tmall Group of Alibaba); Bencheng Yan (Taobao & Tmall Group of Alibaba); Zhiyuan Zeng (Taobao & Tmall Group of Alibaba); Zihao Wang (Taobao & Tmall Group of Alibaba); Jiaqi Liu (Taobao & Tmall Group of Alibaba); Di Wang (Taobao & Tmall Group of Alibaba); Wenbo Su (Taobao & Tmall Group of Alibaba); Pengjie Wang (Taobao & Tmall Group of Alibaba); Jian Xu (Taobao & Tmall Group of Alibaba); Bo Zheng (Taobao & Tmall Group of Alibaba)

A Framework for Evaluating AI Agents in Open-Ended Conversations via Scripted Simulation
DOI: 10.1145/3711896.3737390

Clarice Wang (University of Pennsylvania); Yimin Shi (National University of Singapore); Xiaokui Xiao (National University of Singapore)

Differentially Private Synthetic Data Release for Topics API Outputs
DOI: 10.1145/3711896.3737391

Travis Dick (Google Research); Alessandro Epasto (Google Research); Adel Javanmard (University of Southern California,Google Research); Josh Karlin (Google Chrome); Andrés Muñoz Medina (Google Chrome); Vahab Mirrokni (Google Research); Sergei Vassilvitskii (Google Research); Peilin Zhong (Google Research)

Fairness-Aware Graph Learning: A Benchmark
DOI: 10.1145/3711896.3737392

Yushun Dong (Florida State University); Song Wang (The University of Virginia); Zhenyu Lei (The University of Virginia); Zaiyi Zheng (The University of Virginia); Jing Ma (Case Western Reserve University); Chen Chen (The University of Virginia); Jundong Li (The University of Virginia)

MentalChat16K: A Benchmark Dataset for Conversational Mental Health Assistance
DOI: 10.1145/3711896.3737393

George Demiris; Jia Xu; Joost Wagenaar; Li Shen; Patryk Orzechowski; Rachael Paulbeck; Ruochen Jin; Shu Yang; Tianyi Wei; Bojian Hou

MonoDeMB: Comprehensive Monocular DepthMap Benchmark
DOI: 10.1145/3711896.3737394

Vaagn Chopuryan (Sber AI); Mikhail Kuznetsov (Sber AI,Skolkovo Institute of Science and Technology); Vasilii Latonov (Sber AI); Vladimir Mashurov (Sber AI,ITMO National Research University); Natalia Semenova (Sber AI,Artificial Intelligence Research Institute)

ZooplanktonBench: A Geo-Aware Zooplankton Recognition and Classification Dataset from Marine Observations
DOI: 10.1145/3711896.3737395

Fukun Liu (University of Georgia); Adam T. Greer (University of Georgia); Gengchen Mai (University of Texas at Austin,University of Georgia); Jin Sun (University of Georgia)

Are Vision LLMs Road-Ready? A Comprehensive Benchmark for Safety-Critical Driving Video Understanding
DOI: 10.1145/3711896.3737396

Dawei Zhou (Computer Science, Virginia Polytechnic Institute and State University); Feng Guo (Statistics, Virginia Polytechnic Institute and State University,Virginia Tech Transportation Institute, Virginia Polytechnic Institute and State University); Liang Shi (Statistics, Virginia Polytechnic Institute and State University,Virginia Tech Transportation Institute, Virginia Polytechnic Institute and State University); Longfeng Wu (Computer Science, Virginia Polytechnic Institute and State University); Tong Zeng (Computer Science, Virginia Polytechnic Institute and State University)

Neurophysiologically Realistic Environment for Comparing Adaptive Deep Brain Stimulation Algorithms in Parkinson’s Disease
DOI: 10.1145/3711896.3737397

Ekaterina Kuzmina (Artificial Intelligence Research Institute,Skolkovo Institute of Science and Technology); Dmitrii Kriukov (Artificial Intelligence Research Institute,Skolkovo Institute of Science and Technology); Mikhail Lebedev (Lomonosov Moscow State University); Dmitry V. Dylov (Artificial Intelligence Research Institute,Skolkovo Institute of Science and Technology)

Flexible Generation of Preference Data for Recommendation Analysis
DOI: 10.1145/3711896.3737398

Simone Mungari (University of Calabria,ICAR-CNR); Erica Coppolillo (University of Calabria,ICAR-CNR); Ettore Ritacco (University of Udine); Giuseppe Manco (ICAR-CNR)

HiDF: A Human-Indistinguishable Deepfake Dataset
DOI: 10.1145/3711896.3737399

Chaewon Kang (Sungkyunkwan University); Seoyoon Jeong (Sungkyunkwan University); Jonghyun Lee (Sungkyunkwan University); Daejin Choi (Incheon National University,Ewha Womans University); Simon S. Woo (Sungkyunkwan University); Jinyoung Han (Sungkyunkwan University)

Simulated Infectious Diseases Datasets with Controlled Data Bias
DOI: 10.1145/3711896.3737401

Ruochen Kong (Computer Science, Emory University); Taylor Anderson (George Mason University); Matthew Scotch (Arizona State University); David J. Heslop (School of Population Health, University of New South Wales); Yonchanok Khaokaew (University of New South Wales); Hao Xue (University of New South Wales); Li Xiong (Emory University); Chandini Raina MacIntyre (University of New South Wales); Flora D. Salim (University of New South Wales); Andreas Züfle (Emory University)

SciHorizon: Benchmarking AI-for-Science Readiness from Scientific Data to Large Language Models
DOI: 10.1145/3711896.3737403

Chuan Qin (Department of Big Data Technology and Application Development, Computer Network Information Center, Chinese Academy of Sciences,University of Chinese Academy of Sciences); Xin Chen (Big Data Technology and Development, Computer Network Information Center, Chinese Academy of Sciences,University of Chinese Academy of Sciences); Chengrui Wang (Computer Network Information Center, Chinese Academy of Sciences); Pengmin Wu (Big Data Technology and Development, Computer Network Information Center, Chinese Academy of Sciences); Xi Chen (School of Computer Science and Technology, University of Science and Technology of China,Computer Network Information Center, Chinese Academy of Sciences); Yihang Cheng (Computer Network Information Center, Chinese Academy of Sciences); Jingyi Zhao (Computer Network Information Center, Chinese Academy of Sciences); Meng Xiao (Computer Network Information Center, Chinese Academy of Sciences); Xiangchao Dong (Computer Network Information Center, Chinese Academy of Sciences); Qingqing Long (Computer Network Information Center, Chinese Academy of Sciences); Boya Pan (Computer Network Information Center, Chinese Academy of Sciences); Han Wu (School of Computer Science and Information Engineering, Hefei University of Technology); Chengzan Li (Computer Network Information Center, Chinese Academy of Sciences,University of Chinese Academy of Sciences); Yuanchun Zhou (Computer Network Information Center, Chinese Academy of Sciences,University of Chinese Academy of Sciences); Hui Xiong (Thrust of Artificial Intelligence, Hong Kong University of Science and Technology (Guangzhou),Department of Computer Science and Engineering, The Hong Kong University of Science and Technology); Hengshu Zhu (Computer Network Information Center, Chinese Academy of Sciences,University of Chinese Academy of Sciences)

When Graph Meets Multimodal: Benchmarking and Meditating on Multimodal Attributed Graph Learning
DOI: 10.1145/3711896.3737404

Hao Yan (Central South University); Chaozhuo Li (Microsoft Research Asia); Jun Yin (Central South University); Zhigang Yu (Central South University); Weihao Han (Microsoft AI); Mingzheng Li (Microsoft AI); Zhengxin Zeng (Microsoft AI); Hao Sun (Microsoft AI); Senzhang Wang (Central South University)

Revolutionizing Database QA with Large Language Models: Comprehensive Benchmark and Evaluation
DOI: 10.1145/3711896.3737405

Yihang Zheng (Xiamen University); Bo Li (Xiamen University); Zhenghao Lin (Xiamen University); Yi Luo (Xiamen University); Xuanhe Zhou (Shanghai Jiao Tong University); Chen Lin (Xiamen University); Guoliang Li (Tsinghua University); Jinsong Su (Xiamen University)

ClimateIQA: A New Dataset and Benchmark to Advance Vision-Language Models in Meteorology Anomalies Analysis
DOI: 10.1145/3711896.3737406

Jian Chen (Thrust of Artificial Intelligence, The Hong Kong University of Science and Technology (Guangzhou),HSBC); Peilin Zhou (Thrust of Data Science and Analytics, The Hong Kong University of Science and Technology (Guangzhou)); Yining Hua (Harvard University); Dading Chong (Peking University); Meng Cao (Mohamed bin Zayed University of Artificial Intelligence); Yaowei Li (Harvard University); Wei Chen (Thrust of Data Science and Analytics, The Hong Kong University of Science and Technology (Guangzhou)); Bing Zhu (HSBC); Junwei Liang (Thrust of Artificial Intelligence, The Hong Kong University of Science and Technology (Guangzhou)); Zixuan Yuan (Thrust of Financial Technology, The Hong Kong University of Science and Technology (Guangzhou))

T3Set: A Multimodal Dataset with Targeted Suggestions for LLM-based Virtual Coach in Table Tennis Training
DOI: 10.1145/3711896.3737407

Ji Ma (State Key Lab of CAD&CG, Zhejiang University); Jiale Wu (State Key Lab of CAD&CG, Zhejiang University); Haoyu Wang (State Key Lab of CAD&CG, Zhejiang University); Yanze Zhang (State Key Lab of CAD&CG, Zhejiang University); Xiao Xie (Department of Sports Science, Zhejiang University); Zheng Zhou (Department of Sports Science, Zhejiang University); Hui Zhang (Department of Sports Science, Zhejiang University); Jiachen Wang (Department of Sports Science, Zhejiang University); Yingcai Wu (State Key Lab of CAD&CG, Zhejiang University)

On the Generalization and Adaptation Ability of Machine-Generated Text Detectors in Academic Writing
DOI: 10.1145/3711896.3737408

Yule Liu (DSA, The Hong Kong University of Science and Technology); Zhiyuan Zhong (The Hong Kong University of Science and Technology (Guangzhou),Southern University of Science and Technology); Yifan Liao (CS, National University of Singapore,Chongqing University); Zhen Sun (Information Hub, The Hong Kong University of Science and Technology (Guangzhou)); Jingyi Zheng (IOT, The Hong Kong University of Science and Technology (Guangzhou)); Jiaheng Wei (Data Science and Analytics, The Hong Kong University of Science and Technology (Guangzhou)); Qingyuan Gong (Research Institute of Intelligent Complex Systems, Fudan University,School of Computer Science, Fudan University); Fenghua Tong (Qilu University of Technology); Yang Chen (Fudan University); Yang Zhang (Group Zhang, CISPA Helmholtz Center for Information Security); Xinlei He (DSA & IoT Thrust, The Hong Kong University of Science and Technology (Guangzhou))

Judge Anything: MLLM as a Judge Across Any Modality
DOI: 10.1145/3711896.3737409

Shu Pu (Huazhong University of Science and Technology); Yaochen Wang (Huazhong University of Science and Technology); Dongping Chen (Huazhong University of Science and Technology); Yuhang Chen (Huazhong University of Science and Technology); Guohao Wang (Huazhong University of Science and Technology); Qi Qin (Huazhong University of Science and Technology); Zhongyi Zhang (Huazhong University of Science and Technology); Zhiyuan Zhang (Huazhong University of Science and Technology); Zetong Zhou (Huazhong University of Science and Technology); Shuang Gong (Huazhong University of Science and Technology); Yi Gui (Huazhong University of Science and Technology); Yao Wan (Huazhong University of Science and Technology); Philip S. Yu (University of Illinois Chicago)

Benchmarking Graph Foundation Models
DOI: 10.1145/3711896.3737410

Jinyu Yang (Beijing University of Posts and Telecommunications); Liangwei Yang (University of Illinois Chicago); Zeyuan Guo (Beijing University of Posts and Telecommunications); Jiayi Gao (Beijing University of Posts and Telecommunications); Jing Wu (Beijing University of Posts and Telecommunications); Tianhao Chai (Beijing University of Posts and Telecommunications); Hai Huang (Beijing University of Posts and Telecommunications); Cheng Yang (Beijing University of Posts and Telecommunications); Chuan Shi (Beijing University of Posts and Telecommunications)

VFLAIR-LLM: A Comprehensive Framework and Benchmark for Split Learning of LLMs
DOI: 10.1145/3711896.3737411

Zixuan Gu (School of Software, Tsinghua University); Qiufeng Fan (Privacy Computing, Wuxi Innovation Center of Tsinghua AIR); Long Sun (Privacy Computing, Wuxi Innovation Center of Tsinghua AIR); Yang Liu (The Hong Kong Polytechnic University,Shanghai Artificial Intelligence Laboratory); Xiaojun Ye (School of Software, Tsinghua University)

BurstGPT: A Real-World Workload Dataset to Optimize LLM Serving Systems
DOI: 10.1145/3711896.3737413

Yuxin Wang (Huawei Hong Kong Research Center); Yuhan Chen (The Hong Kong University of Science and Technology (Guangzhou)); Zeyu Li (The Hong Kong University of Science and Technology (Guangzhou)); Xueze Kang (Hong Kong University of Science and Technology (Guangzhou)); Yuchu Fang (Huawei Technologies Ltd.); Yeju Zhou (Huawei Technologies Ltd.); Yang Zheng (Huawei Technologies Ltd.); Zhenheng Tang (The Hong Kong University of Science and Technology); Xin He (Hong Kong Baptist University); Rui Guo (Tsinghua University); Xin Wang (Tsinghua University); Qiang Wang (Harbin Institute of Technology, Shenzhen); Amelie Chi Zhou (Hong Kong Baptist University); Xiaowen Chu (Hong Kong University of Science and Technology (Guangzhou))

Saliency-Bench: A Comprehensive Benchmark for Evaluating Visual Explanations
DOI: 10.1145/3711896.3737414

Yifei Zhang (Emory University); James Song (Emory University); Siyi Gu (Stanford University); Tianxu Jiang (University of Michigan – Ann Arbor); Bo Pan (Emory University); Guangji Bai (Emory University); Liang Zhao (Emory University)

MethaneS2CM: A Dataset for Multispectral Deep Methane Emission Detection
DOI: 10.1145/3711896.3737415

Hongxuan Liu (Department of Electrical and Computer Engineering, University of Alberta); Juliana Y. Leung (Department of Civil and Environmental Engineering, University of Alberta); Di Niu (Department of Electrical and Computer Engineering, University of Alberta)

MetamatBench: Integrating Heterogeneous Data, Computational Tools, and Visual Interface for Metamaterial Discovery
DOI: 10.1145/3711896.3737416

Jianpeng Chen (Computer Science, Virginia Polytechnic Institute and State University); Wangzhi Zhan (Computer Science, Virginia Polytechnic Institute and State University); Haohui Wang (Computer Science, Virginia Polytechnic Institute and State University); Zian Jia (Materials Science and Engineering, University of Pennsylvania,Ecology and Evolutionary Biology, Princeton University); Jingru Gan (Computer Science, University of California, Los Angeles); Junkai Zhang (Computer Science, University of California, Los Angeles); Jingyuan Qi (Computer Science, Virginia Polytechnic Institute and State University); Tingwei Chen (EECS, University of Tennessee, Knoxville); Lifu Huang (Computer Science, University of California, Davis); Muhao Chen (Computer Science, University of California, Davis); Ling Li (University of Pennsylvania); Wei Wang (Computer Science, University of California, Los Angeles); Dawei Zhou (Computer Science, Virginia Polytechnic Institute and State University)

VideoConviction: A Multimodal Benchmark for Human Conviction and Stock Market Recommendations
DOI: 10.1145/3711896.3737417

Michael Galarnyk (Georgia Institute of Technology); Veer Kejriwal (Georgia Institute of Technology); Agam Shah (Georgia Institute of Technology); Yash Bhardwaj (Georgia Institute of Technology); Nicholas Watney Meyer (Georgia Institute of Technology); Anand Krishnan (Stanford University); Sudheer Chava (Georgia Institute of Technology)

TH-Bench: Evaluating Evading Attacks via Humanizing AI Text on Machine-Generated Text Detectors
DOI: 10.1145/3711896.3737418

Jingyi Zheng (IOT, The Hong Kong University of Science and Technology (Guangzhou)); Junfeng Wang (The Hong Kong University of Science and Technology (Guangzhou)); Wenhan Dong (AI, The Hong Kong University of Science and Technology (Guangzhou)); Xinlei He (DSA & IoT Thrust, The Hong Kong University of Science and Technology (Guangzhou)); Yule Liu (DSA, The Hong Kong University of Science and Technology (Guangzhou)); Zhen Sun (Information Hub, The Hong Kong University of Science and Technology (Guangzhou))

IdeaBench: Benchmarking Large Language Models for Research Idea Generation
DOI: 10.1145/3711896.3737419

Aidong Zhang (Computer Science, University of Virginia); Albert Huang (Department of Computer Science, University of Virginia, Charlottesville); Amir Hassan Shariatmadari (Computer Science, University of Virginia, Charlottesville); Corey M. Williams (Immunology & Biomedical Engineering, University of Virginia, Charlottesville); Guangzhi Xiong (Computer Science, University of Virginia, Charlottesville,English Language, Tsinghua University); Myles Kim (University of Virginia, Charlottesville); Sikun Guo (Computer Science, University of Virginia, Charlottesville,College of Electronic Information and Optical Engineering, Nankai University); Stefan Bekiranov (Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville,Gaasterland Lab, Rockefeller University)

Exploring the Potential of Foundation Models as Reliable AI Contact Centers
DOI: 10.1145/3711896.3737420

Hoyoon Byun (Department of Applied Statistics and Data Science, Yonsei University); Minhoi Park (Institute of Data Science, Yonsei University); Seolah Kim (120 Dasan Call Foundation); EunBi Kim (120 Dasan Call Foundation); Kyungwoo Song (Department of Applied Statistics and Data Science, Yonsei University)

When Heterophily Meets Heterogeneity: Challenges and a New Large-Scale Graph Benchmark
DOI: 10.1145/3711896.3737421

Junhong Lin (Massachusetts Institute of Technology,Huazhong University of Science and Technology); Xiaojie Guo (IBM Research,George Mason University); Shuaicheng Zhang (Virginia Tech,University of Maryland, College Park); Yada Zhu (IBM Research,Rutgers University); Julian Shun (Massachusetts Institute of Technology)

DCA-Bench: A Benchmark for Dataset Curation Agents
DOI: 10.1145/3711896.3737422

Benhao Huang (Carnegie Mellon University); Yingzhuo Yu (University of Illinois at Urbana-Champaign); Jin Huang (University of Michigan – Ann Arbor); Xingjian Zhang (University of Michigan – Ann Arbor); Jiaqi W. Ma (University of Illinois Urbana-Champaign)

UP-Bench: A Benchmark for Underwater Path Planning Algorithms
DOI: 10.1145/3711896.3737424

Di Yang (Data Science, College of William and Mary); Yanhai Xiong (Data Science, College of William and Mary)

Capillary Dataset: A Dataset of Nail-fold Capillaries Captured by Microscopy for Diabetes Detection
DOI: 10.1145/3711896.3737425

Hang Thi Phuong Nguyen (AI Convergence, Chonnam National University); Hieyong Jeong (Artificial Intelligence Convergence, Chonnam National University,Graduate School of Medicine, Osaka University)

$EFO_{k}$-CQA: Towards Knowledge Graph Complex Query Answering beyond Set Operation
DOI: 10.1145/3711896.3737426

Hang Yin (Department of Mathematical Science, Tsinghua University); Zihao Wang (Department of Computer Science and Engineering, The Hong Kong University of Science and Technology); Weizhi Fei (Department of Mathematical Science, Tsinghua University); Yangqiu Song (Department of Computer Science and Engineering, Hong Kong University of Science and Technology)

NL2SQL-BUGs: A Benchmark for Detecting Semantic Errors in NL2SQL Translation
DOI: 10.1145/3711896.3737427

Xinyu Liu (The Hong Kong University of Science and Technology (Guangzhou)); Shuyu Shen (The Hong Kong University of Science and Technology (Guangzhou)); Boyan Li (The Hong Kong University of Science and Technology (Guangzhou)); Nan Tang (The Hong Kong University of Science and Technology (Guangzhou)); Yuyu Luo (The Hong Kong University of Science and Technology (Guangzhou))

EBES: Easy Benchmarking for Event Sequences
DOI: 10.1145/3711896.3737428

Dmitry Osin (Skolkovo Institute of Science and Technology); Egor Shvetsov (Skolkovo Institute of Science and Technology); Evgeny Burnaev (Skolkovo Institute of Science and Technology,Artificial Intelligence Research Institute); Igor Udovichenko (Skolkovo Institute of Science and Technology,Vega Institute Foundation); Viktor Moskvoretskii (Skolkovo Institute of Science and Technology,Higher School of Economics)

TrustGLM: Evaluating the Robustness of GraphLLMs Against Prompt, Text, and Structure Attacks
DOI: 10.1145/3711896.3737429

Qihai Zhang (Center for Data Science, New York University); Xinyue Sheng (Data Science, New York University Shanghai); Yuanfu Sun (Courant Institute, New York University,School of Artificial Intelligence, Jilin University); Qiaoyu Tan (Computer Science, New York University Shanghai,Texas A&M University)

POVE: A Preoptimized Vault of Expressions for Symbolic Regression Research and Benchmarking
DOI: 10.1145/3711896.3737430

Kei Sen Fong (Department of Electrical and Computer Engineering, National University of Singapore); Mehul Motani (Department of Electrical and Computer Engineering, Institute of Data Science, N.1 Institute for Health, Institute for Digital Medicine (WisDM), National University of Singapore)

EMBER2024 — A Benchmark Dataset for Holistic Evaluation of Malware Classifiers
DOI: 10.1145/3711896.3737431

Robert J. Joyce (Booz Allen Hamilton); Gideon Miller (Laboratory for Physical Sciences); Phil Roth (Data Science, CrowdStrike); Richard J. Zak (Booz Allen Hamilton); Elliott Zaresky-Williams (Booz Allen Hamilton); Hyrum Anderson (Cisco Systems); Edward Raff (Booz Allen Hamilton); James Holt (Laboratory for Physical Sciences)

ScIRGen: Synthesize Realistic and Large-Scale RAG Dataset for Scientific Research
DOI: 10.1145/3711896.3737432

Junyong Lin (The Hong Kong University of Science and Technology (Guangzhou)); Lu Dai (The Hong Kong University of Science and Technology (Guangzhou),The Hong Kong University of Science and Technology); Ruiqian Han (The Hong Kong University of Science and Technology (Guangzhou)); Yijie Sui (Institute of Tibetan Plateau Research, Chinese Academy of Sciences); Ruilin Wang (Lanzhou University); Sun Xingliang (Lanzhou University); Qinglin Wu (Institute of Tibetan Plateau Research, Chinese Academy of Sciences); Min Feng (Institute of Tibetan Plateau Research, Chinese Academy of Sciences,College of Resources and Environment, University of Chinese Academy of Sciences); Hao Liu (The Hong Kong University of Science and Technology (Guangzhou),The Hong Kong University of Science and Technology); Hui Xiong (The Hong Kong University of Science and Technology (Guangzhou),The Hong Kong University of Science and Technology)

RL4CO: An Extensive Reinforcement Learning for Combinatorial Optimization Benchmark
DOI: 10.1145/3711896.3737433

Federico Berto (KAIST,OMELET); Chuanbo Hua (KAIST,OMELET); Junyoung Park (KAIST); Laurin Luttmann (Leuphana University); Yining Ma (MIT); Fanchen Bu (KAIST); Jiarui Wang (Southeast University); Haoran Ye (Peking University); Minsu Kim (Mila,KAIST); Sanghyeok Choi (KAIST); Nayeli Gast Zepeda (Bielefeld University); André Hottung (Bielefeld University); Jianan Zhou (Nanyang Technological University); Jieyi Bi (Nanyang Technological University); Yu Hu (Soochow University); Fei Liu (City University of Hong Kong); Hyeonah Kim (Mila,Université de Montréal); Jiwoo Son (OMELET); Haeyeon Kim (KAIST); Davide Angioni (University of Brescia); Wouter Kool (ORTEC); Zhiguang Cao (Singapore Management University); Qingfu Zhang (City University of Hong Kong); Joungho Kim (KAIST); Jie Zhang (Nanyang Technological University); Kijung Shin (KAIST); Cathy Wu (MIT); Sungsoo Ahn (KAIST); Guojie Song (Peking University); Changhyun Kwon (KAIST,OMELET); Kevin Tierney (Bielefeld University); Lin Xie (Brandenburg University of Technology); Jinkyoo Park (KAIST,OMELET)

Towards Understanding Link Predictor Generalizability Under Distribution Shifts
DOI: 10.1145/3711896.3737434

Jay Revolinsky (Michigan State University); Harry Shomer (Michigan State University); Jiliang Tang (Michigan State University)

CURE: A dataset for Clinical Understanding & Retrieval Evaluation
DOI: 10.1145/3711896.3737435

Nadia Athar Sheikh (Clinia); Daniel Buades Marcos (Clinia); Anne-Laure Jousse (Clinia); Akintunde Oladipo (Clinia); Olivier Rousseau (Clinia); Jimmy Lin (University of Waterloo)

MathWriting: A Dataset For Handwritten Mathematical Expression Recognition
DOI: 10.1145/3711896.3737436

Philippe Gervais (Inceptive); Anastasiia Fadeeva (Google DeepMind); Andrii Maksai (Google DeepMind)

A Guide to Misinformation Detection Data and Evaluation
DOI: 10.1145/3711896.3737437

Camille Thibault (Université de Montréal); Jacob-Junqi Tian (Vector Institute,Mila – Quebec Artificial Intelligence Institute); Gabrielle Péloquin-Skulski (Massachusetts Institute of Technology); Taylor Lynn Curtis (Mila – Quebec Artificial Intelligence Institute); James Zhou (University of California, Berkeley); Florence Laflamme (Université de Montréal); Yuxiang Guan (McMaster University); Reihaneh Rabbany (McGill University,Mila – Quebec Artificial Intelligence Institute); Jean-François Godbout (Université de Montréal); Kellin Pelrine (McGill University,Mila – Quebec Artificial Intelligence Institute)

Towards Better Benchmark Datasets for Inductive Knowledge Graph Completion
DOI: 10.1145/3711896.3737438

Harry Shomer (Michigan State University); Jay Revolinsky (Michigan State University); Jiliang Tang (Michigan State University)

TimeGraph: Synthetic Benchmark Datasets for Robust Time-Series Causal Discovery
DOI: 10.1145/3711896.3737439

Muhammad Hasan Ferdous (Information Systems, University of Maryland, Baltimore County); Emam Hossain (Information Systems, University of Maryland, Baltimore County); Md Osman Gani (Information Systems, University of Maryland, Baltimore County)

SatHealth: A Multimodal Public Health Dataset with Satellite-based Environmental Factors
DOI: 10.1145/3711896.3737440

Yuanlong Wang (The Ohio State University); Pengqi Wang (The Ohio State University); Changchang Yin (The Ohio State University); Ping Zhang (The Ohio State University)

BTS: A Comprehensive Benchmark for Tie Strength Prediction
DOI: 10.1145/3711896.3737441

Xueqi Cheng (Vanderbilt University); Catherine Yang (Vanderbilt University); Yuying Zhao (Vanderbilt University); Yu Wang (University of Oregon); Hamid Karimi (Utah State University); Tyler Derr (Vanderbilt University)

TSFM-Bench: A Comprehensive and Unified Benchmark of Foundation Models for Time Series Forecasting
DOI: 10.1145/3711896.3737442

Zhe Li (School of Data Science and Engineering, East China Normal University); Xiangfei Qiu (School of Data Science and Engineering, East China Normal University); Peng Chen (School of Data Science and Engineering, East China Normal University); Yihang Wang (School of Data Science and Engineering, East China Normal University); Hanyin Cheng (School of Data Science and Engineering, East China Normal University); Yang Shu (School of Data Science and Engineering, East China Normal University); Jilin Hu (School of Data Science and Engineering, East China Normal University); Chenjuan Guo (School of Data Science and Engineering, East China Normal University); Aoying Zhou (School of Data Science and Engineering, East China Normal University); Christian S. Jensen (Department of Computer Science, Aalborg University); Bin Yang (School of Data Science and Engineering, East China Normal University,Department of Computer Science, Aalborg University)

FULTR: A Large-scale Fusion Learning to Rank Dataset and its application for Satisfaction-Oriented Ranking
DOI: 10.1145/3711896.3737443

Yuchen Li (Baidu Inc.); Hao Zhang (Baidu Inc.); Haojie Zhang (Baidu Inc.); Hengyi Cai (Baidu Inc.); Xinyu Ma (Baidu Inc.); Shuaiqiang Wang (Baidu Inc.); Haoyi Xiong (Baidu Inc.); Zhaochun Ren (Leiden University); Maarten de Rijke (University of Amsterdam); Dawei Yin (Baidu Inc.)

WikiRAG: Revisiting Wikidata KGC Datasets with Community Updates and Retrieval-Augmented Generation
DOI: 10.1145/3711896.3737444

Djellel Difallah (New York University Abu Dhabi)