Google will now use data including device usage, search history, voice information, or location to train AI products including Bard.
Though your public data is always public (not a big surprise), it being now used for training Google’s AI models is a significant step. By continuing to use Google’s products and services (and agreeing to the policy), you’re making all of your data useful for Google’s AI training. This includes, most notably:
- App, browser, and device usage
- Search queries, YouTube history, ad clicks, voice information, and purchase activity
- GPS, location, and IP address
If you can’t make the switch to privacy-oriented services such as DuckDuckGo for search or Firefox for browsing, it might be prudent to review what Google knows about you. You can go to your account settings and navigate to privacy control for options. Additionally, you can find alternatives to 15 Google products on this list from Restore Privacy.
The debate against companies using vast amounts of private data to train their language models is ever-growing. This includes a class action lawsuit against OpenAI that attacks the ChatGPT tool for being trained on “massive amounts of personal data from the internet” without any notice, consent, or “just compensation” at an unprecedented scale to “develop a volatile, untested technology” that puts everyone at “incalculable risk.”
In related news, Twitter CEO Elon Musk said that the platform was getting data pillaged a lot to affect the service for normal users. To take a step in this direction and combat AI scraping the platform, Twitter announced limits on how many tweets can be accessed by accounts. Generative AI models especially want to scrape human-generated content and Twitter and Reddit are two of the most important resources in that regard.
Reddit has also rolled out charges for its API use which has been a drama of another kind, forcing the Apollo for Reddit app to shut down among other issues.