no code implementations • EMNLP 2021 • Jiaxin Huang, Chunyuan Li, Krishan Subudhi, Damien Jose, Shobana Balakrishnan, Weizhu Chen, Baolin Peng, Jianfeng Gao, Jiawei Han
This paper presents an empirical study to efficiently build named entity recognition (NER) systems when a small amount of in-domain labeled data is available.
no code implementations • 7 May 2024 • Yongqi Tong, Sizhe Wang, Dawei Li, Yifan Wang, Simeng Han, Zi Lin, Chengsong Huang, Jiaxin Huang, Jingbo Shang
Therefore, we present PuzzleBen, a weakly supervised benchmark that comprises 25,147 complex questions, answers, and human-generated rationales across various domains, such as brainteasers, puzzles, riddles, parajumbles, and critical reasoning tasks.
no code implementations • 26 Mar 2024 • Jingyu Xu, Binbin Wu, Jiaxin Huang, Yulu Gong, Yifan Zhang, Bo Liu
With the explosive growth and diversification of medical data, as well as the continuous improvement of medical needs and challenges, artificial intelligence technology is playing an increasingly important role in the medical field.
no code implementations • 20 Mar 2024 • Yulu Gong, Jiaxin Huang, Bo Liu, Jingyu Xu, Binbin Wu, Yifan Zhang
This paper highlights the role of machine learning technology in addressing resource allocation and virtual machine migration challenges in cloud computing.
no code implementations • 27 Feb 2024 • Yifan Zhang, Bo Liu, Yulu Gong, Jiaxin Huang, Jingyu Xu, Weixiang Wan
Each user requests a certain amount of cloud resources from the cloud computing center at a specific time.
no code implementations • 26 Jan 2024 • Shulin Li, Liqiang Yu, Bo Liu, Qunwei Lin, Jiaxin Huang
However, few studies to date have combined AI techniques with SCT scanning for the diagnosis of early lung cancer.
no code implementations • 9 Jan 2024 • Hector A. Gonzalez, Jiaxin Huang, Florian Kelber, Khaleelulla Khan Nazeer, Tim Langer, Chen Liu, Matthias Lohrmann, Amirhossein Rostami, Mark Schöne, Bernhard Vogginger, Timo C. Wunderlich, Yexin Yan, Mahmoud Akl, Christian Mayr
This development is accompanied by a rapid growth of the required computational demands for larger models and more data.
no code implementations • 6 Jan 2024 • Jiaxin Huang, Xinyu Zhao, Chang Che, Qunwei Lin, Bo Liu
To address the specific needs of ELLs, we propose the use of DeBERTa, a state-of-the-art neural language model, for improving automated feedback tools.
no code implementations • 11 Oct 2023 • Siru Ouyang, Jiaxin Huang, Pranav Pillai, Yunyi Zhang, Yu Zhang, Jiawei Han
In this study, we propose OnEFET, where we (1) enrich each node in the ontology structure with two types of extra information: instance information for training sample augmentation and topic information to relate types to contexts, and (2) develop a coarse-to-fine typing algorithm that exploits the enriched information by training an entailment model with contrasting topics and instance-based augmented training samples.
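A hedged sketch of the coarse-to-fine, entailment-based typing idea described above: score the ontology's coarse types first, then score only the children of the winning type. The zero-shot classification pipeline (itself NLI-based) stands in for the paper's trained entailment model, and the tiny two-level ontology below is purely hypothetical.

```python
# Coarse-to-fine entity typing with an entailment model (illustrative sketch only).
from transformers import pipeline

nli = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# Hypothetical two-level ontology: coarse type -> fine-grained children.
ontology = {"person": ["athlete", "politician"],
            "organization": ["company", "government agency"]}

def type_mention(context, mention):
    template = f"In this sentence, {mention} is a {{}}."
    # Coarse level: choose among the ontology roots.
    coarse = nli(context, list(ontology), hypothesis_template=template)
    best_coarse = coarse["labels"][0]
    # Fine level: only score the children of the winning coarse type.
    fine = nli(context, ontology[best_coarse], hypothesis_template=template)
    return best_coarse, fine["labels"][0]

print(type_mention("Lionel Messi scored twice for Inter Miami.", "Lionel Messi"))
```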
no code implementations • ICCV 2023 • Xinya Chen, Jiaxin Huang, Yanrui Bin, Lu Yu, Yiyi Liao
Unsupervised learning of 3D-aware generative adversarial networks has recently made significant progress.
1 code implementation • 6 Nov 2022 • Yu Meng, Martin Michalski, Jiaxin Huang, Yu Zhang, Tarek Abdelzaher, Jiawei Han
In this work, we study few-shot learning with PLMs from a different perspective: We first tune an autoregressive PLM on the few-shot samples and then use it as a generator to synthesize a large amount of novel training samples which augment the original training set.
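A minimal sketch of this tune-then-generate augmentation idea, assuming a GPT-2 checkpoint from Hugging Face transformers and a handful of labeled (label, text) pairs; the prompt format, hyperparameters, and generation settings are illustrative, not the paper's exact recipe.

```python
# Tune an autoregressive PLM on few-shot samples, then use it as a generator
# of synthetic training examples (hedged sketch, not the paper's exact method).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Hypothetical few-shot training set.
few_shot = [("positive", "A warm, funny and touching film."),
            ("negative", "The plot is thin and the pacing drags.")]

# 1) Tune the generator on verbalized few-shot samples.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for epoch in range(3):
    for label, text in few_shot:
        prompt = f"Review ({label}): {text}{tokenizer.eos_token}"
        batch = tokenizer(prompt, return_tensors="pt")
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# 2) Use the tuned PLM to synthesize novel training samples per label.
model.eval()
synthetic = []
for label, _ in few_shot:
    prompt_ids = tokenizer(f"Review ({label}):", return_tensors="pt").input_ids
    samples = model.generate(prompt_ids, do_sample=True, top_p=0.9,
                             max_new_tokens=40, num_return_sequences=5,
                             pad_token_id=tokenizer.eos_token_id)
    synthetic += [(label, tokenizer.decode(s[prompt_ids.shape[1]:],
                                           skip_special_tokens=True))
                  for s in samples]
# `synthetic` augments the original few-shot set before training the classifier.
```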
no code implementations • 20 Oct 2022 • Jiaxin Huang, Shixiang Shane Gu, Le Hou, Yuexin Wu, Xuezhi Wang, Hongkun Yu, Jiawei Han
We show that our approach improves the general reasoning ability of a 540B-parameter LLM (74.4%->82.1% on GSM8K, 78.2%->83.0% on DROP, 90.0%->94.4% on OpenBookQA, and 63.4%->67.9% on ANLI-A3) and achieves state-of-the-art-level performance, without any ground truth label.
Ranked #1 on Question Answering on DROP
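An illustrative sketch of the self-improvement loop described in this entry: sample multiple chain-of-thought rationales per unlabeled question, keep the majority-vote ("self-consistent") answer, and assemble the agreeing rationales into a fine-tuning set without any ground-truth labels. `sample_rationales` is a hypothetical stand-in for querying the LLM.

```python
# Self-improvement via self-consistency filtering (hedged sketch).
from collections import Counter

def sample_rationales(question, k=8):
    """Hypothetical helper: returns k (rationale, answer) pairs sampled from
    the LLM with chain-of-thought prompting at non-zero temperature."""
    raise NotImplementedError

def build_self_training_set(unlabeled_questions, k=8, min_agreement=0.5):
    self_training_set = []
    for q in unlabeled_questions:
        samples = sample_rationales(q, k)                 # k sampled reasoning paths
        votes = Counter(ans for _, ans in samples)
        best_answer, count = votes.most_common(1)[0]      # self-consistency vote
        if count / k < min_agreement:                     # skip low-agreement questions
            continue
        # Keep only rationales that reach the majority answer; these become
        # (question, rationale, answer) fine-tuning examples -- no ground truth used.
        for rationale, ans in samples:
            if ans == best_answer:
                self_training_set.append({"question": q,
                                          "rationale": rationale,
                                          "answer": best_answer})
    return self_training_set
```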
1 code implementation • 28 Jun 2022 • Jiaxin Huang, Yu Meng, Jiawei Han
We study the problem of few-shot Fine-grained Entity Typing (FET), where only a few annotated entity mentions with contexts are given for each entity type.
no code implementations • 26 May 2022 • Xia Liu, Geng Liu, Jiaxin Huang, Hao-Kai Zhang, Xin Wang
Variational quantum algorithms (VQAs) are expected to establish valuable applications on near-term quantum computers.
no code implementations • 22 May 2022 • Jiaxin Huang, Tianqi Liu, Jialu Liu, Adam D. Lelkes, Cong Yu, Jiawei Han
Multi-Task Learning (MTL) models have shown their robustness, effectiveness, and efficiency for transferring learned knowledge across tasks.
1 code implementation • 9 Feb 2022 • Yu Meng, Yunyi Zhang, Jiaxin Huang, Yu Zhang, Jiawei Han
Interestingly, there is still no standard approach to deploying PLMs for topic discovery as a better alternative to topic models.
1 code implementation • 9 Feb 2022 • Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han
Pretrained language models (PLMs) have demonstrated remarkable performance in various natural language processing tasks: Unidirectional PLMs (e.g., GPT) are well known for their superior text generation capabilities; bidirectional PLMs (e.g., BERT) have been the prominent choice for natural language understanding (NLU) tasks.
Ranked #5 on Zero-Shot Text Classification on AG News
no code implementations • 17 Oct 2021 • Suyu Ge, Jiaxin Huang, Yu Meng, Sharon Wang, Jiawei Han
Opinion summarization aims to profile a target by extracting opinions from multiple documents.
1 code implementation • EMNLP 2021 • Yu Meng, Yunyi Zhang, Jiaxin Huang, Xuan Wang, Yu Zhang, Heng Ji, Jiawei Han
We study the problem of training named entity recognition (NER) models using only distantly-labeled data, which can be automatically obtained by matching entity mentions in the raw text with entity types in a knowledge base.
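A simple illustration of the distant-labeling step mentioned above: match raw-text spans against a knowledge-base dictionary to produce noisy BIO tags with no human annotation (the paper then trains a noise-robust NER model on top). The tiny dictionary here is hypothetical.

```python
# Distant labeling of NER data by dictionary matching against a KB (sketch).
kb = {"barack obama": "PER", "united states": "LOC", "google": "ORG"}

def distant_label(tokens, kb, max_len=4):
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        for n in range(min(max_len, len(tokens) - i), 0, -1):   # longest match first
            span = " ".join(tokens[i:i + n]).lower()
            if span in kb:
                tags[i] = "B-" + kb[span]
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + kb[span]
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags

tokens = "Barack Obama visited Google headquarters in the United States .".split()
print(list(zip(tokens, distant_label(tokens, kb))))
```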
2 code implementations • 29 Dec 2020 • Jiaxin Huang, Chunyuan Li, Krishan Subudhi, Damien Jose, Shobana Balakrishnan, Weizhu Chen, Baolin Peng, Jianfeng Gao, Jiawei Han
This paper presents a comprehensive study to efficiently build named entity recognition (NER) systems when a small amount of in-domain labeled data is available.
2 code implementations • EMNLP 2020 • Yu Meng, Yunyi Zhang, Jiaxin Huang, Chenyan Xiong, Heng Ji, Chao Zhang, Jiawei Han
In this paper, we explore the potential of only using the label name of each class to train classification models on unlabeled data, without using any labeled documents.
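A much-simplified illustration of the label-name-only idea: pseudo-label unlabeled documents by their similarity to the label names, then keep the confident pseudo-labels for training. The actual paper additionally uses masked-language-model predictions of category-indicative words and self-training; the label names, documents, and confidence threshold below are hypothetical.

```python
# Pseudo-labeling unlabeled documents using only label names (hedged sketch).
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = enc(**batch).last_hidden_state           # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)
    vec = (hidden * mask).sum(1) / mask.sum(1)            # mean pooling over tokens
    return torch.nn.functional.normalize(vec, dim=-1)

label_names = ["politics", "sports", "business", "technology"]   # the only supervision
unlabeled_docs = ["The senate passed the new budget bill today.",
                  "The striker scored twice in the final match."]

label_vecs = embed(label_names)
doc_vecs = embed(unlabeled_docs)
scores = doc_vecs @ label_vecs.T                           # cosine similarities
pseudo_labels = scores.argmax(dim=-1)
confident = scores.max(dim=-1).values > 0.5                # keep confident docs only
for doc, y, keep in zip(unlabeled_docs, pseudo_labels, confident):
    if keep:
        print(label_names[y], "<-", doc)                   # candidates for self-training
```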
1 code implementation • EMNLP 2020 • Jiaxin Huang, Yu Meng, Fang Guo, Heng Ji, Jiawei Han
Aspect-based sentiment analysis of review texts is of great value for understanding user feedback in a fine-grained manner.
1 code implementation • 13 Oct 2020 • Jiaxin Huang, Yiqing Xie, Yu Meng, Yunyi Zhang, Jiawei Han
Taxonomy is not only a fundamental form of knowledge representation, but also crucial to vast knowledge-rich applications, such as question answering and web search.
1 code implementation • 18 Jul 2020 • Yu Meng, Yunyi Zhang, Jiaxin Huang, Yu Zhang, Chao Zhang, Jiawei Han
Mining a set of meaningful topics organized into a hierarchy is intuitively appealing since topic correlations are ubiquitous in massive text corpora.
Ranked #1 on Topic Models on Arxiv HEP-TH citation graph
1 code implementation • 1 May 2020 • Yu Zhang, Yu Meng, Jiaxin Huang, Frank F. Xu, Xuan Wang, Jiawei Han
Then, based on the same generative process, we synthesize training samples to address the bottleneck of label scarcity.
1 code implementation • 27 Jan 2020 • Jiaxin Huang, Yiqing Xie, Yu Meng, Jiaming Shen, Yunyi Zhang, Jiawei Han
Given a small set of seed entities (e.g., "USA", "Russia"), corpus-based set expansion is to induce an extensive set of entities which share the same semantic class (Country in this example) from a given corpus.
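A hedged sketch of the core set-expansion loop: rank candidate corpus entities by their average similarity to the current set and add the best ones each round. `entity_vectors` is a hypothetical mapping from entities to embeddings; the paper additionally guides expansion with language-model-generated class names.

```python
# Iterative corpus-based set expansion from a few seed entities (sketch).
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def expand(seeds, entity_vectors, rounds=3, per_round=2):
    expanded = list(seeds)
    for _ in range(rounds):
        candidates = [e for e in entity_vectors if e not in expanded]
        # Score each candidate by its mean similarity to every entity in the set.
        scored = [(np.mean([cosine(entity_vectors[c], entity_vectors[s])
                            for s in expanded]), c) for c in candidates]
        scored.sort(reverse=True)
        expanded += [c for _, c in scored[:per_round]]
    return expanded

# Usage with toy random vectors (illustrative only).
rng = np.random.default_rng(0)
entity_vectors = {e: rng.normal(size=16) for e in
                  ["USA", "Russia", "China", "France", "apple", "table"]}
print(expand(["USA", "Russia"], entity_vectors))
```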
1 code implementation • NeurIPS 2019 • Yu Meng, Jiaxin Huang, Guangyuan Wang, Chao Zhang, Honglei Zhuang, Lance Kaplan, Jiawei Han
While text embeddings are typically learned in the Euclidean space, directional similarity is often more effective in tasks such as word similarity and document clustering, which creates a gap between the training stage and usage stage of text embedding.
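A minimal sketch of the "directional" view this entry points at: keep embeddings on the unit sphere so that both training and downstream usage rely on cosine (angular) similarity, closing the gap between the two stages. This illustrates the idea only; it is not the paper's Riemannian optimization procedure.

```python
# Keeping text embeddings on the unit sphere (illustrative sketch).
import numpy as np

rng = np.random.default_rng(0)
dim, vocab = 64, 1000
emb = rng.normal(size=(vocab, dim))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)      # start on the unit sphere

def sgd_step_on_sphere(emb, idx, grad, lr=0.1):
    """Take a gradient step for one word, then re-project to the sphere so the
    learned geometry matches the cosine similarity used at the usage stage."""
    emb[idx] -= lr * grad
    emb[idx] /= np.linalg.norm(emb[idx])
    return emb

def cosine_sim(u, v):
    return float(u @ v)        # unit-norm vectors: dot product equals cosine

print(cosine_sim(emb[0], emb[1]))
```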
1 code implementation • 20 Aug 2019 • Yu Meng, Jiaxin Huang, Guangyuan Wang, Zihan Wang, Chao Zhang, Yu Zhang, Jiawei Han
We propose a new task, discriminative topic mining, which leverages a set of user-provided category names to mine discriminative topics from text corpora.
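A hedged sketch of the discriminative-topic idea in this entry: for each user-provided category name, keep words that are close to that category but far from all other categories. Toy random embeddings stand in for the corpus-trained ones, and the category names and vocabulary below are hypothetical.

```python
# Mining category-discriminative terms from user-provided category names (sketch).
import numpy as np

rng = np.random.default_rng(1)
categories = ["sports", "politics", "economy"]         # user-provided category names
vocab = ["goal", "league", "election", "senate", "inflation", "market"]
vec = {w: rng.normal(size=32) for w in categories + vocab}
for w in vec:
    vec[w] /= np.linalg.norm(vec[w])                    # unit-normalize embeddings

def discriminative_topic(cat, top_k=3):
    others = [c for c in categories if c != cat]
    scores = {}
    for w in vocab:
        sim_cat = vec[w] @ vec[cat]
        sim_other = max(vec[w] @ vec[c] for c in others)
        scores[w] = sim_cat - sim_other                  # reward closeness to `cat` only
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

for cat in categories:
    print(cat, "->", discriminative_topic(cat))
```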