Learn more about how Inflection AI, Inc. develops AI models and how we apply them in our products, such as our AI chatbot, Pi.
Last updated: August 26, 2025
This article provides an overview of the information we collect outside of our products to develop our AI models. For information on how we collect and use information from users of our products and services, including to develop our models, please see our Privacy Policy.
Large language models (“LLMs”), such as the model powering Pi, our AI chatbot trained to be a personal AI assistant, are trained on a variety of content (such as text, images and other multimedia) so that they can learn the patterns and connections between different types of words or content. LLMs learn by breaking information down into individual components such as words, turning those words into numerical representations on a graph, and developing an understanding of the relationships between words based on their proximity to each other on that graph.
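To make the idea of proximity between numerical representations more concrete, here is a minimal sketch in Python. It is an illustration only and does not describe how our models are built: the three-dimensional vectors are hand-picked, hypothetical values, and the names TOY_VECTORS, cosine_similarity and closest_word are invented for this example. Real LLMs learn representations with far more dimensions from their training data.

```python
import math

# Hypothetical, hand-picked vectors for illustration only.
# They are not taken from any real model.
TOY_VECTORS = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_word(word, vectors=TOY_VECTORS):
    """Return the other word whose vector is most similar to the given word's vector."""
    others = (w for w in vectors if w != word)
    return max(others, key=lambda w: cosine_similarity(vectors[word], vectors[w]))
```

With these toy values, "cat" sits much closer to "dog" than to "car", which is the sense in which words that are used in similar ways end up near each other.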
Fine-tuning is the process of further training a pre-trained LLM on a specific task or dataset. This helps the LLM to better understand the nuances and context of the task at hand, which improves its performance and accuracy. This training is important so that the model performs effectively and safely.
We train and fine-tune our LLMs using data from three sources:
Publicly available data, e.g., publicly available web pages on the internet.
Licensed datasets from third parties, including open source datasets, i.e., datasets that come with legal permission for us to use, modify, and share them freely.
Synthetically generated data, which we (e.g., our human trainers or researchers) create and use for training our models.
It is not our intention to “train” our models on personal information specifically, but given where the datasets used to train our LLMs are sourced from, the datasets may contain personal information. Therefore, our “training data” may incidentally include personal information.
We have no intention to identify or better understand personal information within the “training data”. Inflection AI only uses “training data” to help our LLMs learn about language and how to understand and respond to it. Inflection AI does not use any personal information within “training data” to contact people, to build profiles about them, to try to sell or market anything to them, or to sell the information itself to any third party.
We have implemented a number of privacy-by-design safeguards for our LLM training. For example:
We have a legal basis for using training information: we base our collection and use of personal information that is incidentally included in “training data” on legitimate interests under privacy laws like the GDPR, as explained in more detail in our Privacy Policy. We have completed a data protection impact assessment to help ensure we are collecting and using this information legally and responsibly. We have also completed a legitimate interests assessment to ensure data subjects’ rights are appropriately balanced against our legitimate interests when carrying out this training.
When developing and training our LLMs, we have adopted strict policies and guidelines for the collection of publicly available information, which safeguard against processing data in an intrusive way. For example, we use a trusted third party that classifies and flags URLs as inappropriate or as containing sensitive content, and we exclude any content flagged as sensitive from pre-training datasets. We also undertake due diligence on the data that we license from third parties to ensure that we have the necessary rights, representations, and warranties to enable us to use the data as intended.
Before commencing the development of new LLMs, we evaluate potential risks and ethical considerations and obtain stakeholder input in an effort to ensure that every solution adheres to appropriate standards of privacy, fairness, and transparency.
Our models are specifically trained to respect privacy. We have put in place output control measures designed to protect the privacy of any personal information incidentally included in the “training data” and to lower the likelihood that user queries return personal information from the “training data”. Inflection AI LLMs are trained not to disclose or repeat personal information of private individuals which may have been incidentally captured in “training data”, even if prompted to do so by a user.
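As a rough illustration of the URL-based exclusion described in the safeguards above, the sketch below drops any page whose URL appears in a set of flagged URLs. It is a simplified example under stated assumptions, not a description of our actual pipeline: the function name exclude_flagged, the dictionary layout of each page, and the idea that flags arrive as a plain collection of URLs are all hypothetical.

```python
def exclude_flagged(pages, flagged_urls):
    """Return only the pages whose URL is not in the flagged collection.

    pages: list of dicts with a "url" key (hypothetical layout).
    flagged_urls: iterable of URLs flagged as sensitive or inappropriate
                  (assumed here to be supplied by a classification service).
    """
    flagged = set(flagged_urls)  # set membership keeps each lookup fast
    return [page for page in pages if page["url"] not in flagged]


# Example with placeholder URLs on the reserved example.com domain.
sample_pages = [
    {"url": "https://example.com/recipes", "text": "How to bake bread"},
    {"url": "https://example.com/flagged", "text": "Flagged content"},
]
kept_pages = exclude_flagged(sample_pages, ["https://example.com/flagged"])
```

In this example only the first page remains in kept_pages, so the flagged page would never reach a pre-training dataset.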
We automatically collect information about your interactions with us or our Services, including:
Our Privacy Policy explains your rights in respect of how we process your personal information. In certain jurisdictions, this includes your right to request a copy of your personal information, and to object to our processing of your personal information or request that it be deleted. We make every effort to respond to such requests. However, please be aware that, in accordance with privacy laws, these rights may not be absolute and we may decline certain requests if we have a lawful reason for doing so. Please reach out to the Inflection AI team at [email protected] if you have any questions about your rights, this Notice or the Privacy Policy.
As further explained in our Privacy Policy, please note that under some countries' laws, you have the right to lodge a complaint with the supervisory authority in the place in which you live or work. A full list of EEA supervisory authorities’ contact details is available here.