New Clearview AI Decision has Implications for OpenAI Investigation
Note: This post also appears on my blog at http://www.teresascassa.ca
The Alberta Court of King’s Bench has issued a decision in Clearview AI’s application for judicial review of an Order made by the province’s privacy commissioner. The Commissioner had ordered Clearview AI to take certain steps following a finding that the company had breached Alberta’s Personal Information Protection Act (PIPA) when it scraped billions of images – including those of Albertans – from the internet to create a massive facial recognition database marketed to police services around the world. The court’s decision is a partial victory for the commissioner. It is interesting and important for several reasons – including for its relevance to generative AI systems and the ongoing joint privacy investigation into OpenAI. These issues are outlined below.
Brief Background
Clearview AI became notorious in 2020 following a New York Times article which broke the story on the company’s activities. Data protection commissioners in Europe and elsewhere launched investigations, which overwhelmingly concluded that the company violated applicable data protection laws. In Canada, the federal privacy commissioner joined forces with the Quebec, Alberta and British Columbia (BC) commissioners, each of which has private sector jurisdiction. Their joint investigation report concluded that their respective laws applied to Clearview AI’s activities as there was a real and substantial connection to their jurisdictions. They found that Clearview collected, used and disclosed personal information without consent, and that no exceptions to consent applied. The key exception advanced by Clearview AI was the exception for “publicly available information”. The Commissioners found that the scope of this exception, which was similarly worded in the federal, Alberta and BC laws, required a narrow interpretation and that the definition in the regulations enacted under each of these laws did not include information published on the internet. The commissioners also found that, contrary to shared legislative requirements, the collection and use of the personal information by Clearview AI was not for a purpose that a reasonable person would consider appropriate in the circumstances. The report of findings made a number of recommendations that Clearview ultimately did not accept. The Quebec, BC and Alberta commissioners all have order making powers (which the federal commissioner does not). Each of these commissioners ordered Clearview to correct its practices, and Clearview sought judicial review of each of these orders. The decision of the BC Supreme Court (which upheld the Commissioner’s order) is discussed in an earlier post. The decision from Quebec has yet to be issued.
In Alberta, Clearview AI challenged the commissioner’s jurisdiction on the basis that Alberta’s PIPA did not apply to its activities. It also argued that the Commissioner’s interpretation of “publicly available information” was unreasonable. In the alternative, Clearview AI argued that ‘publicly available information’, as interpreted by the Commissioner, was an unconstitutional violation of its freedom of expression. It also contested the Commissioner’s finding that Clearview did not have a reasonable purpose for collecting, using and disclosing the personal information.
The Jurisdictional Question
Courts have established that Canadian data protection laws will apply where there is a real and substantial connection to the relevant jurisdiction. Clearview AI argued that it was a US-based company that scraped most of its data from social media websites mainly hosted outside of Canada, and that therefore its activities took place outside of Canada and its provinces. Yet, as Justice Feasby noted, “[s]trict adherence to the traditional territorial conception of jurisdiction would make protecting privacy interests impossible when information may be located everywhere and nowhere at once” (at para 50). He noted that there was no evidence regarding the actual location of the servers of social media platforms, and that Clearview AI’s scraping activities went beyond social media platforms. Justice Feasby ruled that he was entitled to infer from available evidence that images of Albertans were collected from servers located in Canada and in Alberta. He observed that in any event, Clearview marketed its services to police in Alberta, and its voluntary decision to cease offering those services did not alter the fact that it had been doing business in Alberta and could do so again. Further, the information at issue in the order was personal information of Albertans. All of this gave rise to a real and substantial connection with Alberta.
Publicly Available Information
The federal Personal Information Protection and Electronic Documents Act (PIPEDA) contains an exception to the consent requirement for “publicly available information”. The meaning of this term is defined in the Regulations Specifying Publicly Available Information. The relevant category is found in s. 1(e) which specifies “personal information that appears in a publication, including a magazine, book or newspaper, in printed or electronic form, that is available to the public, where the individual has provided the information.” Alberta’s PIPA contains a similar exception (as does BC’s law), although the wording is slightly different. Section 7(e) of the Alberta regulations creates an exception to consent where:
(e) the personal information is contained in a publication, including, but not limited to, a magazine, book or newspaper, whether in printed or electronic form, but only if
(i) the publication is available to the public, and
(ii) it is reasonable to assume that the individual that the information is about provided that information; [My emphasis]
In their joint report of findings, the Commissioners found that their respective “publicly available information” exceptions did not include social media platforms.
Clearview AI made much of the wording of Alberta’s exception, arguing that even if it could be said that the PIPEDA language excluded social media platforms, the use of the words “including but not limited to” in the Alberta regulation made it clear that the list was not closed, nor was it limited to the types of publications referenced.
In interpreting the exceptions for publicly available information, the Commissioners emphasized the quasi-constitutional nature of privacy legislation. They found that the privacy rights should receive a broad and expansive interpretation and the exceptions to those rights should be interpreted narrowly. The commissioners also found significant differences between social media platforms and the more conventional types of publications referenced in their respective regulations, making it inappropriate to broaden the exception. Justice Feasby, applying reasonableness as the appropriate standard of review, found that the Alberta Commissioner’s interpretation of the exception was reasonable.
Freedom of Expression
Had the court’s decision ended there, the outcome would have been much the same as the result in the BC Supreme Court. However, in this case, Clearview AI also challenged the constitutionality of the regulations. It sought a declaration that if the exception were interpreted as limited to books, magazines and comparable publications, then this violated its freedom of expression under s. 2(b) of the Canadian Charter of Rights and Freedoms.
Clearview AI argued that its commercial activity of scraping the internet to provide information services to its clients was expressive and was therefore protected speech. Justice Feasby noted that Clearview’s collection of internet-based information was bot-driven and not engaged in by humans. Nevertheless, he found that “scraping the internet with a bot to gather images and information may be protected by s. 2(b) when it is part of a process that leads to the conveyance of meaning” (at para 104).
Interestingly, Justice Feasby noted that since Clearview no longer offered its services in Canada, any expressive activities took place outside of Canada, and thus were arguably not protected by the Charter. However, he acknowledged that the services had at one point been offered in Canada and could be again. He observed that “until Clearview removes itself permanently from Alberta, I must find that its expression in Alberta is restricted by PIPA and the PIPA Regulation” (at para 106).
Having found a prima facie breach of s. 2(b), Justice Feasby considered whether this was a reasonable limit demonstrably justified in a free and democratic society, under s. 1 of the Charter. The Commissioner argued that the expression at issue in this case was commercial in nature and thus of lesser value. Justice Feasby was not persuaded by category-based assumptions of value; rather, he preferred an approach in which the regulation of commercial expression is consistent with and proportionate to its character.
Justice Feasby found that the Commissioner’s reasonable interpretation of the exception in s. 7 of the regulations meant that it would exclude social media platforms or “other kinds of internet websites where images and personal information may be found” (at para 118). He noted that this is a source-based exception – in other words that some publicly available information may be used without knowledge or consent, but not other similar information. The exclusion depends on the source and not the purpose of use for the personal information. Justice Feasby expressed concern that the same exception that would exclude the scraping of images from the internet for the creation of a facial recognition database would also apply to the activities of search engines widely used by individuals to gain access to information on the internet. He thus found that the publicly available information exception was overbroad, stating: “Without a reasonable exception to the consent requirement for personal information made publicly available on the internet without use of privacy settings, internet search service providers are subject to a mandatory consent requirement when they collect, use and disclose such personal information by indexing and delivering search results” (at para 138). He stated: “I take judicial notice of the fact that search engines like Google are an important (and perhaps the most important) way individuals access information on the internet” (at para 144).
Justice Feasby also noted that while it was important to give individuals some level of control over their personal information, “it must also be recognized that some individuals make conscious choices to make their images and information discoverable by search engines and that they have the tools in the form of privacy settings to prevent the collection, use, and disclosure of their personal information” (at para 143). His constitutional remedy – to strike the words “including, but not limited to magazines, books, and newspapers” from the regulation – was designed to allow “the word ‘publication’ to take its ordinary meaning which I characterize as ‘something that has been intentionally made public’” (at para 149).
The Belt and Suspenders Approach
Although excising part of the publicly available information definition seems like a major victory for Clearview AI, in practical terms it is not. This is because of what the court refers to as the law’s “belt and suspenders approach”. This metaphor suggests that there are two routes to keep up privacy’s pants – and loosening the belt does not remove the suspenders. In this case, the suspenders are located in the clause found in PIPA, as well as in its federal and BC counterparts, that limits the collection, use and disclosure of personal information to only that which “a reasonable person would consider appropriate in the circumstances”. The court ruled that the Commissioner’s conclusion that the scraping of personal information was not for purposes that a reasonable person would consider appropriate in the circumstances was reasonable and should not be overturned. This approach, set out in the joint report of findings, emphasized that the company’s mass data scraping involved over 3 billion images of individuals, including children. It was used to create biometric face prints that would remain in Clearview’s databases even if the source images were removed from the internet, and it was carried out for commercial purposes. The commissioners also found that the purposes were not related to the reasons why individuals might have shared their photographs online, could be used to the detriment of those individuals, and created the potential for a risk of significant harm. Continuing with his analogy to search engines, Justice Feasby noted that Clearview AI’s use of publicly available images was very different from the use of the same images by search engines. The different purposes are essential to the reasonableness determination. 
Justice Feasby states: “The ‘purposes that are reasonable’ analysis is individualized such that a finding that Clearview’s use of personal information is not for reasonable purposes does not apply to other organizations and does not threaten the operations of the internet” (at para 159). He noted that the commercial dimensions of the use are not determinative of reasonableness. However, he observed that “where images and information are posted to social media for the purpose of sharing with family and friends (or prospective friends), the commercialization of such images and information by another party may be a relevant consideration in determining whether the use is reasonable” (at para 160).
The result is that Clearview AI’s scraping of images from the public internet violates Alberta’s PIPA. The court further ruled that the Commissioner’s order was clear and specific, and capable of being implemented. Justice Feasby required Clearview AI to report within 50 days on its good faith progress in taking steps to cease the collection, use and disclosure of images and biometric data collected from individuals in Alberta, and to delete images and biometric data in its database that are from individuals in Alberta.
Harmonized Approaches to Data Protection Law in Canada
This decision highlights some of the challenges to the growing collaboration and cooperation of privacy commissioners in Canada when it comes to interpreting key terms and concepts in substantially similar legislation. Increasingly, the commissioners engage in joint investigations where complaints involve organizations operating in multiple jurisdictions in Canada. While this occurs primarily in the private sector context, it is not exclusively the case, as a recent joint investigation between the BC and Ontario commissioners into a health data breach demonstrates. Joint investigations conserve regulator resources and save private sector organizations from having to respond to multiple similar and concurrent investigations. In addition, joint investigations can lead to harmonized approaches and interpretations of shared concepts in similar legislation. This is a good thing for creating certainty and consistency for those who do business across Canadian jurisdictions.
However, harmonized approaches are vulnerable to multiple judicial review applications, as was the case following the Clearview AI investigation. Although the BC Supreme Court found that the BC Commissioner’s order was reasonable, what the Alberta King’s Bench decision demonstrates is that a common front can be fractured. Justice Feasby found that a slight difference in wording between Alberta’s regulations and those in BC and at the federal level was sufficient to justify finding the scope of Alberta’s publicly available information exception to be unconstitutional.
Harmonized approaches may also be vulnerable to unilateral legislative change. In this respect, it is worth noting that an Alberta report on the impending reform of PIPA recommends “that the Government take all necessary steps, including through proposing amendments to the Personal Information Protection Act, to improve alignment of all provincial privacy legislation, including in the private, public and health sectors” (at p. 13).
The Elephant in the Room: Generative AI and Data Protection Law in Canada
In his reasons, Justice Feasby made Google’s search functions a running comparison for Clearview AI’s data scraping practices. Perhaps a better example would have been the data scraping that takes place in order to train generative AI models. However, the court may have avoided that example because there is an ongoing investigation by the Alberta, Quebec, BC and federal commissioners into OpenAI’s practices. The findings in that investigation are overdue – perhaps the delay has, at least in part, been caused by anticipation of what might happen with the Alberta Clearview AI judicial review. The Alberta decision will likely present a conundrum for the commissioners.
Reading between the lines of Justice Feasby’s decision, it is entirely possible that he would find that the scraping of the public internet to gather training data for generative AI systems would both fall within the exception for publicly available information and be for a purpose that a reasonable person would consider appropriate in the circumstances. Generative AI tools are now widely used – more widely even than search engines since these tools are now also embedded in search engines themselves. To find that personal information that may be indiscriminately found on the internet cannot be collected and used in this way because consent is required is fundamentally impractical. In the EU, the legitimate interest exception in the GDPR provides latitude for use in this way without consent, and recent guidance from the European Data Protection Supervisor suggests that legitimate interests, combined where appropriate with Data Protection Impact Assessments, may address key data protection issues.
In this sense, the approach taken by Justice Feasby seems to carve a path for data protection in a GenAI era in Canada by allowing data scraping of publicly available sources on the Internet in principle, subject to the limit that any such collection or any ensuing use or disclosure of the personal information must be for purposes that a reasonable person would consider appropriate in the circumstances. However, this is not a perfect solution. In the first place, unlike the EU approach, which ensures that other privacy protective measures (such as privacy impact assessments) govern this kind of mass collection, Canadian law remains outdated and inadequate. Further, the publicly available information exceptions – including Alberta’s even after its constitutional nip and tuck – also emphasize that, to use the language of Alberta’s PIPA, it must be “reasonable to assume that the individual that the information is about provided that information”. In fact, there will be many circumstances in which individuals have not provided the information posted online about them. This is the case with photos from parties, family events and other social interactions. Further, social media – and the internet as a whole – is full of non-consensual images, gossip, anecdotes and accusations.
The solution crafted by the Alberta Court of King’s Bench is therefore only a partial solution. A legitimate interest exception would likely serve much better in these circumstances, particularly if it is combined with broader governance obligations to ensure that privacy is adequately considered and assessed. Of course, before this happens, the federal government’s privacy reform measures in Bill C-27 must be resuscitated in some form or another.