The Google Arts & Culture "Say What You See" Game: How to Play
Google Arts & Culture has a feature that turns your description of a painting into an AI-powered guessing game, and it's more interesting than it sounds.

What is "Say What You See" on Google Arts & Culture?
"Say What You See" is an experiment inside the Google Arts & Culture app where you type what you see in a painting and the app's AI tries to match your words to the actual artwork. It sounds simple. It gets surprisingly strange.
The basic loop goes like this: the app shows you a detail or a full view of a famous artwork, you type a description using plain language, and Google's image-understanding model scores how well your words match its own "understanding" of the image. The closer your description is to how the AI reads the painting, the better your score.
It's one of several playful experiments Google has bundled into the Arts & Culture app alongside things like the art selfie face-match and the color palette tool. But Say What You See stands out because it actually teaches you something about how computers interpret visual art, and that's worth paying attention to.
How does Say What You See actually work?
Under the hood, Google is running a vision model that analyzes the artwork and generates a set of descriptive tags or embeddings. When you type your description, the app compares your words to those tags using natural language processing. High overlap means a high score.
This matters because the model doesn't "see" the way humans do. It pattern-matches. It's been trained on enormous datasets of images and their associated text, so it's very good at identifying objects, colors, compositions, and sometimes even emotional tone. But it can miss symbolic content entirely. A woman holding a skull in a 17th-century vanitas painting might get tagged as "person" and "object" without any sense of what the skull means culturally.
Google's own documentation on the Experiments with Google platform describes these Arts & Culture features as ways to "explore art in unexpected ways," which is accurate, but undersells the genuine strangeness of watching a machine describe a Vermeer.
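To make the tag-comparison idea above concrete, here is a minimal sketch in Python. It is an illustration only, not Google's implementation: the tag set is invented, and the scoring rule (the share of the model's tags that show up in your description) is an assumption chosen for simplicity.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return the set of alphabetic words in it."""
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap_score(description: str, model_tags: set[str]) -> float:
    """Return the fraction of model tags found in the description (0.0 to 1.0)."""
    if not model_tags:
        return 0.0
    words = tokenize(description)
    return len(words & model_tags) / len(model_tags)


# Hypothetical tags for a candlelit portrait; not real output from any Google model.
tags = {"man", "seated", "dark", "candle", "table", "shadow"}

concrete = overlap_score("seated man, dark clothing, candle on a table", tags)
poetic = overlap_score("a quiet grief fills the room", tags)

print(f"concrete: {concrete:.2f}")  # 5 of the 6 tags matched
print(f"poetic:   {poetic:.2f}")    # none of the tags matched
```

A real system would more likely compare embeddings than exact words, so synonyms such as "sitting" and "seated" could still count as a match. Exact word overlap is just the easiest version to read.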
What the AI is actually detecting
The model tends to score highest on descriptions that include:
- Dominant colors ("blue," "golden," "dark background")
- Recognizable subjects ("woman," "horse," "landscape," "crowd")
- Composition cues ("figure on the left," "bright center," "dark edges")
- Texture and material words ("oil paint," "rough brushwork," "smooth skin")
Abstract concepts score poorly. Writing "melancholy" or "divine light" might feel accurate to you, but the AI doesn't reward it reliably. The sweet spot is concrete visual description, which is a useful writing exercise in its own right.
How to use Google Arts & Culture step by step
If you've never opened the app, here's the short version. Download Google Arts & Culture from the App Store or Google Play. The web version at artsandculture.google.com works fine too, though a few experiments are app-only. Once you're in:
- Tap the grid icon or search "Say What You See" in the search bar.
- You can also find it under the "Play" section or browse the Experiments tab.
- Once the game loads, an artwork appears. Type what you see in the text field.
- Hit submit. The app scores your description and shows you how it matched up.
- You get a new artwork and repeat.
The app also has a guided version that gives you prompts, which works well for kids or for classrooms. Teachers have used it to get students looking closely at paintings without the pressure of "saying the right thing." There's no wrong answer exactly, just some answers that happen to match machine vision better than others.
If you want a deeper walkthrough with screenshots and specific tips for getting high scores, our full guide covers the mechanics in more detail.
Can I take a photo and have Google tell me what it is?
Yes, but that's a different tool. Google Lens is what you want for that. It's built into the Google app on both Android and iOS, and you can access it through Google Images on desktop by clicking the camera icon in the search bar.
Point Google Lens at a painting, a sculpture, a poster, or almost any visual object and it will try to identify it. For well-known works it's genuinely impressive. Point it at Van Gogh's "Starry Night" and it will tell you the title, the artist, the year, where it's held (MoMA in New York), and surface related articles.
According to Wikipedia's entry on Google Lens, the tool uses neural networks trained on Google's image index and can identify "over a billion objects." For art specifically, it pulls from the Arts & Culture database, which includes high-resolution images from more than 2,000 partner museums worldwide.
Say What You See is the reverse of Google Lens. Lens takes a photo and gives you words. Say What You See takes a painting, asks for your words, and judges how well you matched what the AI already sees. Same underlying technology, opposite direction.
How to find your look-alike on Google Arts & Culture
This is the Art Selfie feature, and it's separate from Say What You See, though both live in the same app. To use it, open Google Arts & Culture, tap the camera icon at the bottom of the screen, and select "Art Selfie." The app uses your front camera to take a photo of your face and then searches its database of artworks for portraits that share your facial geometry.
Results are often surprising and sometimes hilarious. People have gotten matched to 16th-century court portraits, Egyptian sculptures, and Japanese woodblock prints. Colossal, the art and design publication that covers digital culture closely, has featured numerous examples of striking Art Selfie matches that went viral precisely because the resemblances felt uncanny rather than algorithmic.
One practical note: the Art Selfie feature isn't available in all countries due to local facial recognition regulations. In the US, Illinois residents have historically seen it blocked because of the state's Biometric Information Privacy Act. If you don't see the camera option, that's likely why.
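For the curious, the "shared facial geometry" search described above can be pictured as a nearest-neighbor lookup. The sketch below is a guess at the general technique, not a description of how Art Selfie is built: it assumes each face has already been reduced to a short list of numbers (an embedding), and the portrait names and values are made up for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; values near 1.0 mean very similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(selfie: list[float], portraits: dict[str, list[float]]) -> str:
    """Return the title of the portrait whose embedding is closest to the selfie's."""
    return max(portraits, key=lambda title: cosine_similarity(selfie, portraits[title]))


# Hypothetical three-number "face embeddings"; real ones would have far more dimensions.
portraits = {
    "Court portrait": [0.9, 0.1, 0.3],
    "Woodblock print": [0.2, 0.8, 0.5],
    "Egyptian sculpture": [0.4, 0.4, 0.9],
}
selfie = [0.25, 0.75, 0.5]

match = best_match(selfie, portraits)
print(match)  # Woodblock print
```

Seen this way, the matching step is only geometry on numbers. Any sense of an uncanny resemblance comes from the person looking at the result.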
Say What You See as a game vs. Say What You See as a learning tool
Most people encounter this as a casual five-minute game, which is fine. But there's a real educational layer here that doesn't get talked about enough.
Describing a painting precisely is hard. Really hard. Art historians spend careers developing the vocabulary. When you play Say What You See and your description of "a man looking sad next to a table" scores poorly while "seated male figure, dark clothing, candle, shadowed background" scores well, you're learning something about formal visual analysis without anyone lecturing you about it.
The exercise maps closely to what art educators call "close looking," a practice where students describe only what they can literally observe before moving to interpretation. The Museum of Modern Art's learning resources describe close looking as foundational to understanding contemporary art, noting that slowing down to describe what's actually there changes how deeply you engage with a work.
Say What You See gamifies this. The AI scoring isn't perfect, but the pressure to find words that the machine will recognize pushes you toward specificity. You stop writing poetry and start writing inventory, which turns out to be its own skill.
Tips for getting better scores
A few things that consistently help:
- Start with the most dominant element. If the sky takes up 60% of the canvas, mention it first.
- Use color names. "Cobalt blue" works better than "deep blue" in most cases.
- Count figures. "Three women" outscores "some women."
- Name objects specifically. "Lute" beats "musical instrument." "Goblet" beats "cup."
- Note the relationship between elements. "Figure facing left toward a window" gives the model orientation cues.
Avoid pure emotion words as your primary description. They can supplement a good concrete description but won't carry you on their own.
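If it helps to see the tips as a recipe, here is a small Python sketch that assembles a description the way the list suggests: biggest element first, with counts and color names attached. The `Element` class, its fields, and the area estimates are all hypothetical. Nothing here talks to the app.

```python
from dataclasses import dataclass


@dataclass
class Element:
    name: str        # a specific noun, e.g. "lute" rather than "musical instrument"
    area: float      # your rough estimate of its share of the canvas, 0.0 to 1.0
    count: int = 1   # how many of them you can see
    color: str = ""  # a color name, if one stands out


def describe(elements: list[Element]) -> str:
    """Build a comma-separated description, most dominant element first."""
    ordered = sorted(elements, key=lambda e: e.area, reverse=True)
    phrases = []
    for e in ordered:
        parts = []
        if e.count > 1:
            parts.append(str(e.count))  # "3 women" rather than "some women"
        if e.color:
            parts.append(e.color)
        parts.append(e.name)
        phrases.append(" ".join(parts))
    return ", ".join(phrases)


draft = describe([
    Element("women", area=0.25, count=3),
    Element("sky", area=0.6, color="cobalt blue"),
    Element("lute", area=0.05),
])
print(draft)  # cobalt blue sky, 3 women, lute
```

The output reads like an inventory rather than a sentence, which is the style the tips above point toward.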
What Say What You See reveals about AI and art
There's something genuinely revealing about playing this game for half an hour. The AI is very good at the things humans often skip when they look at art (exact colors, object counts, spatial layout) and very bad at the things humans immediately reach for (meaning, mood, narrative, symbolism).
A painting like Goya's "Saturn Devouring His Son" is formally easy for a vision model: large figure, dark background, grotesque subject, specific body parts. But nothing in the machine's description captures why that painting is disturbing, or what Saturn represents, or what Goya might have been saying about power and violence. The Museo del Prado's entry on the work discusses its psychological intensity at length. A vision model has none of that context.
This gap between visual description and visual understanding is one of the central problems in machine vision research right now. Say What You See lets ordinary people feel that gap directly, which is more effective than reading a paper about it.
If you find yourself getting interested in how human artists think about what they see (rather than how machines do), the ideas in "You're More of an Artist Than You Think" connect directly to why close observation matters as a creative practice, not just as a game mechanic.
The "challenge" versions and social sharing
Say What You See has developed an informal challenge culture online. People screenshot their scores, compare results, and post examples of descriptions that either nailed a painting or spectacularly failed. The failure cases are the most entertaining: someone writes a beautifully poetic paragraph about a Monet and scores 12%, while "blue water, bridge, green plants, reflection" scores 89%.
There's also a competitive angle if you use the timed version. The app gives you a set number of seconds to describe as much as you can before scoring, which forces you to prioritize. Experienced players learn to front-load the most visually prominent elements immediately rather than constructing complete sentences.
Some teachers have run classroom competitions using projected artworks, with students typing simultaneously on their own devices. It gets loud. It also, consistently, generates better attention to paintings than any lecture about the same works.
A quick note on the broader Arts & Culture app
Say What You See is one experiment inside an app that now has dozens of them, including virtual museum tours, color-matching games, cultural heritage timelines, and a tool that finds the nearest museum to your location. Google has been building this platform since 2011, when it launched as the Art Project with 17 museum partners. As of 2024 it works with over 2,000 cultural institutions across 80 countries.
The scale means the artwork database you're playing against in Say What You See includes everything from the Uffizi Gallery's Botticellis to Aboriginal Australian art to ancient Persian manuscripts. When the game serves you something unfamiliar, that's actually the most valuable moment. The AI is seeing it the same way it sees everything else. You might be the one with more context.