The next step in artificial text detection

NLP researchers and industry practitioners are seeking new benchmarks to detect artificial text. A joint project of MIT Lincoln Laboratory, Penn State University, University of Oslo, and Toloka proposes a novel approach to benchmarking artificial text detectors. Contribute to building the benchmark, or use it for your NLP and GenAI projects.

What makes the benchmark different?

The new benchmark compares LLM-generated texts to human-edited versions. Texts cover a range of use cases, from creative writing to summarizing news articles.

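As a minimal illustration of how such paired data might be used, the Python sketch below scores a raw LLM output against its human-edited counterpart with a toy detector. The BenchmarkEntry layout, field names, and the naive_detector heuristic are assumptions made for this example only; the released benchmark may use a different format.

```python
# Minimal sketch only: the record layout and detector below are assumptions,
# not the released benchmark schema or an official baseline.
from dataclasses import dataclass

@dataclass
class BenchmarkEntry:
    use_case: str           # e.g. "creative_writing" or "news_summarization"
    generated_text: str     # raw LLM output
    human_edited_text: str  # the same output after human editing

def naive_detector(text: str) -> float:
    """Toy stand-in for a real detector: higher score = more likely machine-generated."""
    # Placeholder heuristic; a real detector would be a trained classifier.
    return 1.0 if "as an ai language model" in text.lower() else 0.5

def pairwise_accuracy(entries: list[BenchmarkEntry]) -> float:
    """Fraction of pairs where the raw LLM output scores strictly higher
    than its human-edited counterpart."""
    if not entries:
        return 0.0
    wins = sum(
        naive_detector(e.generated_text) > naive_detector(e.human_edited_text)
        for e in entries
    )
    return wins / len(entries)
```

Scoring detectors pairwise like this reflects the benchmark's central question: does light human editing already push machine-generated text past a detector's threshold?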

How can I contribute to the benchmark?

1. Make edits
   Edit the model's output to make it sound more human: change, remove, or add text to align with your preferences.

2. Submit the new version of the text

3. Access the benchmark when it's ready to use


Sign up to access the benchmark

Leave your email, and you'll be the first to know when the benchmark is ready. You can use it to build and test next-generation artificial text detectors.


  • Jooyoung Lee

    “Amidst the proliferation of AI-generated content, it is critical to distinguish between artificial and human-authored text. Your contribution will help us build automated systems essential for gauging credibility and upholding the integrity of information sharing!”

    Penn State University

  • Ekaterina Artemova

    “At Toloka, we value the quality of the text that we produce for our community. We need your help to avoid AI-generated texts contaminating our datasets.”

    Toloka

  • Adaku Uchendu

    “Disinformation erodes democracy, and LLMs, while impressive, will accelerate such erosion. Thus, to preserve the authenticity of our information ecosystem (which is already somewhat compromised), we need your participation in this project to build models that can accurately distinguish artificial texts from human-written ones.”

    MIT Lincoln Laboratory

  • Jason Lucas

    “GenAI offers significant benefits in enhancing education and healthcare but carries risks. Its potential for unintended, dishonest, and malicious use threatens trust and societal integrity. Its capabilities exceed traditional methods, posing a risk of psycho-social manipulation in crucial events and vulnerable regions. Adopting GenAI requires a commitment to ensure its impact is positive and responsibly managed.”

    Penn State University

  • Shaurya Rohatgi

    “In an era where misinformation and AI-generated texts threaten the very integrity of our information landscape, your support becomes indispensable. Join our mission to protect democracy and maintain the integrity of information. Your role is crucial in developing systems to identify AI-generated texts, ensuring credibility and authenticity.”

    AllSci

  • Saranya Venkatraman

    “Credible and safer tools can only be built on a strong foundation of authentic data that you can help us curate! Your input will be the difference between unhinged and safe AI-based solutions.”

    Penn State University

  • Vladislav Mikhailov

    “The capabilities of generative AI are impressive but can be used for malicious purposes. We invite you to contribute to creating a benchmark of human-edited generated texts to address practical implications of detecting generated text, such as the contamination of publicly available datasets with generated texts and the identification of AI-assisted responses on crowdsourcing platforms.”

    University of Oslo

  • Natalia Fedorova

    “Have you ever tried telling apart AI-generated text from human-written content? It is not always easy. As technology progresses, it is crucial to enhance both our writing abilities and our capacity to identify AI-generated text. Let's collaborate to refine our benchmarks. Your contribution can make a significant difference!”



© 2024 Toloka AI BV