Proof of Contribution

The SixGPT validation process is open source and can be freely browsed and forked here: sixgpt-validation

Authenticity

The SixGPT validator randomly samples a subset of the data referenced in the data file. For each sampled item, the content is retrieved using a key present in the data file, and the retrieved data is matched against the data in the file.
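
The authenticity check described above could be sketched roughly as follows. This is an illustration only, not the code from the sixgpt-validation repository: the record fields `key` and `content`, the `fetch` callable, and the default sample size are all hypothetical names chosen for the example.

```python
import random
from typing import Callable, Mapping, Optional, Sequence


def authenticity_score(
    records: Sequence[Mapping[str, str]],
    fetch: Callable[[str], Optional[str]],
    sample_size: int = 10,
    seed: Optional[int] = None,
) -> float:
    """Return the fraction of sampled records whose stored content
    matches the content retrieved via the record's key.

    `records` and `fetch` are hypothetical stand-ins for the data file
    and for whatever retrieval mechanism the real validator uses.
    """
    if not records:
        return 0.0

    rng = random.Random(seed)
    # Sample without replacement, capped at the number of records.
    sample = rng.sample(list(records), min(sample_size, len(records)))

    # A record counts as authentic only on an exact match (an assumption;
    # the real validator may normalize or compare more loosely).
    matches = sum(1 for rec in sample if fetch(rec["key"]) == rec["content"])
    return matches / len(sample)
```

In this sketch `fetch` can be any function that maps a key to content, for example the `get` method of a dictionary in a test, or a wrapper around an HTTP client in practice.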

Ownership

Due to the synthetic nature of the data, ownership scores are not applicable.

Quality

The SixGPT validator randomly samples a subset of the examples referenced in the data file. The question in each selected example is passed to an LLM, which evaluates how good the question is given the provided context. The answer is then passed to the LLM, which evaluates how good the answer is given the question and the context.
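
A minimal sketch of this two-step evaluation is shown below. It makes several assumptions that the description above does not confirm: the LLM is modeled as a hypothetical `judge` callable that takes a prompt and returns a score between 0 and 1, the prompts are invented for illustration, and the question and answer scores are averaged with equal weight.

```python
import random
from typing import Callable, Mapping, Optional, Sequence


def _clamp(score: float) -> float:
    """Keep a judge score inside the assumed [0, 1] range."""
    return max(0.0, min(1.0, score))


def quality_score(
    examples: Sequence[Mapping[str, str]],
    judge: Callable[[str], float],
    sample_size: int = 10,
    seed: Optional[int] = None,
) -> float:
    """Average LLM-judged quality over a random sample of examples.

    Each example is assumed to have `context`, `question`, and `answer`
    fields. `judge` is a hypothetical wrapper around an LLM call.
    """
    if not examples:
        return 0.0

    rng = random.Random(seed)
    sample = rng.sample(list(examples), min(sample_size, len(examples)))

    total = 0.0
    for ex in sample:
        # Step 1: rate the question given only the context.
        question_prompt = (
            f"Context:\n{ex['context']}\n\n"
            f"Question:\n{ex['question']}\n\n"
            "Rate the quality of the question from 0 to 1."
        )
        # Step 2: rate the answer given the question and the context.
        answer_prompt = (
            f"Context:\n{ex['context']}\n\n"
            f"Question:\n{ex['question']}\n\n"
            f"Answer:\n{ex['answer']}\n\n"
            "Rate the quality of the answer from 0 to 1."
        )
        q = _clamp(judge(question_prompt))
        a = _clamp(judge(answer_prompt))
        # Equal weighting is an assumption made for this sketch.
        total += (q + a) / 2

    return total / len(sample)
```

Passing the judge in as a callable keeps the sketch independent of any particular LLM provider and makes it easy to substitute a stub when testing.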

Uniqueness

Duplicate contexts are penalized. Duplicate or near-duplicate questions and answers are also penalized, based on embeddings of both types.
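
One way such a uniqueness check might look is sketched below. The details are assumptions rather than the validator's actual logic: `embed` is a hypothetical callable that returns a vector for a piece of text, near-duplicates are detected with cosine similarity against a made-up threshold of 0.95, and an example is penalized if its context repeats exactly or if either its question or its answer is close to one seen earlier.

```python
import math
from typing import Callable, List, Mapping, Sequence, Set


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def uniqueness_score(
    examples: Sequence[Mapping[str, str]],
    embed: Callable[[str], Sequence[float]],
    threshold: float = 0.95,
) -> float:
    """Return the fraction of examples that are not duplicates of an
    earlier example. The first occurrence is always kept as unique."""
    if not examples:
        return 0.0

    seen_contexts: Set[str] = set()
    seen_questions: List[Sequence[float]] = []
    seen_answers: List[Sequence[float]] = []
    unique = 0

    for ex in examples:
        q_vec = embed(ex["question"])
        a_vec = embed(ex["answer"])

        # Exact-match check for contexts.
        duplicate = ex["context"] in seen_contexts
        # Embedding-based check for questions and answers (assumed rule:
        # a near match on either one is enough to penalize).
        duplicate = duplicate or any(
            cosine(q_vec, prev) >= threshold for prev in seen_questions
        )
        duplicate = duplicate or any(
            cosine(a_vec, prev) >= threshold for prev in seen_answers
        )

        if not duplicate:
            unique += 1

        seen_contexts.add(ex["context"])
        seen_questions.append(q_vec)
        seen_answers.append(a_vec)

    return unique / len(examples)
```

The pairwise comparison here is quadratic in the number of examples, which is fine for a sampled subset; a larger dataset would likely call for an approximate nearest-neighbor index instead.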