Microsoft has deleted a blog post that faced criticism for encouraging developers to use pirated Harry Potter books for training AI models. The post, authored by a senior product manager, promoted a new Azure feature for integrating generative AI into applications.

The post suggested using the popular book series to build Q&A systems and generate AI-driven fan fiction. To that end, it linked to a Kaggle dataset containing all seven Harry Potter books, which was incorrectly marked as public domain. The dataset was removed after the publication that reported on the post contacted its uploader.

Experts noted that although the author may not have known the dataset's true copyright status, using copyrighted material for AI training raises significant legal and ethical questions. Microsoft declined to comment on the matter. Removing the blog post is seen as a prudent step amid ongoing lawsuits over AI models trained on pirated content.