Wikipedia dataset containing cleaned articles in all languages. The datasets are built from the Wikipedia dumps (https://dumps.wikimedia.org/) with one split per ...
People also ask
What is the Hugging Face Datasets library?
Hugging Face Datasets is a library developed by Hugging Face, a company focused on natural language processing (NLP) technologies. It provides a collection of pre-processed, ready-to-use datasets for NLP, computer vision, and audio tasks.
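Since the answer refers to a collection of ready-to-use datasets, a minimal sketch of browsing that collection from Python may help; it assumes the `huggingface_hub` package is installed, and the search term "wikipedia" is only an example query.

```python
# Browse the Hub's dataset collection programmatically.
from huggingface_hub import list_datasets

# "wikipedia" is an arbitrary example search term.
for ds_info in list_datasets(search="wikipedia", limit=5):
    print(ds_info.id)
```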
What is Hugging Face used for?
Hugging Face is a machine learning (ML) and data science platform and community that helps users build, train, and deploy ML models. It provides the infrastructure to demo, run, and deploy artificial intelligence (AI) in live applications.
How do I download a dataset from Hugging Face?
Go to the Datasets section of the Hugging Face Hub and search for the dataset you want to download. On the dataset page, open the "Files and versions" tab, where the underlying data files can be downloaded directly; alternatively, fetch them programmatically, as in the sketch that follows.
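For the programmatic route, a minimal sketch using `huggingface_hub` is below; the repository id `wikimedia/wikipedia` and the file name `README.md` are assumptions standing in for whatever dataset and file you actually need.

```python
# Fetch dataset files without the web UI. The repo id and filename below are
# placeholders for the dataset you searched for.
from huggingface_hub import hf_hub_download, snapshot_download

# Download a single file listed under "Files and versions".
card_path = hf_hub_download(
    repo_id="wikimedia/wikipedia",
    filename="README.md",
    repo_type="dataset",
)

# Or mirror the entire dataset repository locally
# (note: for a full Wikipedia dump this pulls every language subset).
repo_dir = snapshot_download(repo_id="wikimedia/wikipedia", repo_type="dataset")

print(card_path, repo_dir)
```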
Is Hugging Face safe?
Data security/privacy: Hugging Face does not store any customer data (payloads or tokens) passed to an Inference Endpoint; logs are kept for 30 days. Every Inference Endpoint uses TLS/SSL to encrypt data in transit, and AWS or Azure Private Link is recommended for organizations.
We're on a journey to advance and democratize artificial intelligence through open source and open science.
Load a dataset in a single line of code, and use our powerful data processing methods to quickly get your dataset ready for training in a deep learning model.
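As a concrete illustration of that one-line load plus processing, here is a small sketch using the WikiText-2 configuration of the WikiText dataset mentioned further down; the config name "wikitext-2-raw-v1" follows the Hub's naming for that dataset, so check the dataset card if it has moved.

```python
# One-line load, then a batched map() to add a derived column.
from datasets import load_dataset

ds = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")

# map() runs the function over batches of rows and stores the result as a new column.
ds = ds.map(lambda batch: {"n_chars": [len(t) for t in batch["text"]]}, batched=True)

print(ds[0])
```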
Dataset Summary. Wiki Question Answering corpus from Microsoft. The WikiQA corpus is a publicly available set of question and sentence pairs, collected and ...
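A small sketch of loading WikiQA from the Hub; the repository id `wiki_qa` and the column names are assumptions taken from the public dataset card, so verify them before relying on them.

```python
# Load the WikiQA question/sentence pairs and peek at one example.
from datasets import load_dataset

wikiqa = load_dataset("wiki_qa", split="train")

row = wikiqa[0]
# Expected columns (per the dataset card): question, answer (the candidate
# sentence), document_title, and a binary label marking correct answers.
print(row["question"], "->", row["answer"], "label:", row["label"])
```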
The dataset is built from the Wikipedia dumps (https://dumps.wikimedia.org/) with one subset per language, each containing a single train split. Each ...
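To enumerate the per-language subsets and load one of them, something like the sketch below works; the repository id `wikimedia/wikipedia` and the "dump-date.language-code" config pattern reflect the Hub's current naming as far as I know, and the Persian subset is used only as an example language.

```python
# List the per-language subsets, then stream one train split.
from datasets import get_dataset_config_names, load_dataset

configs = get_dataset_config_names("wikimedia/wikipedia")
print(configs[:5])  # dump-date.language-code pairs, e.g. "20231101.fa"

# Streaming avoids downloading the full dump up front.
fa_wiki = load_dataset("wikimedia/wikipedia", "20231101.fa", split="train", streaming=True)
print(next(iter(fa_wiki))["title"])
```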
The WikiText language modeling dataset is a collection of over 100 million tokens extracted from the set of verified Good and Featured articles on Wikipedia.
wikipedia persons masked: a filtered version of the Wikipedia dataset containing only pages about people. Dataset Summary: contains ~70k pages from Wikipedia, ...
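The exact construction of that filtered subset is not described here, but the general pattern for deriving one with `datasets` looks like the sketch below; the person heuristic is purely illustrative (real pipelines typically rely on Wikidata's "instance of: human"), and the repository and config ids are the same assumptions as above.

```python
# Illustrative only: filter a streamed Wikipedia dump down to likely person pages.
from datasets import load_dataset

wiki = load_dataset("wikimedia/wikipedia", "20231101.en", split="train", streaming=True)

def looks_like_person(example):
    # Crude stand-in heuristic: biographies usually mention "born" early on.
    return "born" in example["text"][:500].lower()

people = wiki.filter(looks_like_person)
print(next(iter(people))["title"])
```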