Interactive web application · 2026
PokéSound
A web app that creates layered ambient soundscapes for every Pokémon.

PokéSound — Case Study
Overview
PokéSound is a generative web app that creates unique, layered ambient soundscapes for every Pokémon. Users search for any Pokémon, or take a short sound quiz to discover their “Spirit Pokémon” — the one whose sonic profile matches their listening preferences.
The core idea: instead of mapping Pokémon to existing playlists, synthesize a living audio portrait from their data. A Charizard sounds like crackling fire, volcanic rumble, hot wind, and distant roaring. A Gengar sounds like eerie whispers, creaking floorboards, a low heartbeat, and wind howling through a cave. The same Pokémon sounds slightly different on every visit.
Live features: soundscape generation for all Pokémon, a 7-round Spirit Pokémon quiz, an audio-reactive canvas visualizer, per-layer mixing controls, dynamic OG images for social sharing, and CC attribution for all sourced audio.
Problem
No tool exists that translates Pokémon’s rich attribute data — types, stats, habitat, legendary status — into a meaningful sensory experience. Fan communities are deeply invested in Pokémon lore and personality, but existing apps only surface visual or textual data.
The opportunity: Pokémon already have richly evocative identities. The challenge is building a system that maps that data to sound in a way that feels intentional and emotionally resonant — not random — while staying within the constraints of free APIs and browser audio limitations.
Goals
- Generate a distinct, coherent soundscape for any of the 900+ Pokémon without manual curation
- Make the experience shareable and social-media-ready
- Keep all audio processing client-side using Web Audio APIs (no server-side audio rendering)
- Stay within Freesound’s 2,000 req/day rate limit through aggressive caching
- Work beautifully on mobile — the primary sharing surface
Tech Stack
| Concern | Technology |
|---|---|
| Framework | Next.js (App Router) + TypeScript |
| Styling | Tailwind CSS |
| Audio | Tone.js + Web Audio API |
| Visualization | Canvas API |
| State | Zustand |
| Deployment | Vercel |
| OG Images | @vercel/og |
| Testing | Vitest (unit) + Playwright (E2E) |
Architecture
The Mapping Engine (lib/mapping.ts)
The core system. It translates Pokémon data into Freesound search queries and audio mix parameters.
Layer architecture: Every soundscape has five simultaneous core audio layers, each with a distinct sonic role, plus a bonus Epic layer for Legendary and Mythical Pokémon:
| Layer | Role |
|---|---|
| Base | Foundational ambient bed (loops continuously) |
| Texture | Environmental detail (loops continuously) |
| Accent | Characteristic punctuation (triggers every 5–15s, driven by Speed stat) |
| Rhythm | Percussive element (loops or triggers periodically) |
| Atmosphere | High-frequency shimmer, wide stereo (loops continuously) |
| Epic (bonus) | Cinematic drone/choir — Legendary and Mythical Pokémon only |
Type → tag mapping: All 18 Pokémon types map to curated Freesound search tags per layer. Fire maps to ["fire", "crackling", "lava", "furnace"] for its base layer. Ghost maps to ["dark ambient", "cave", "dungeon"]. These are stored in data/type-tags.json and inform the Freesound queries.
Dual-type blending: ~60% of Pokémon have two types. The primary type drives base, texture, and atmosphere. The secondary type drives accent and rhythm. A Water/Psychic Pokémon gets ocean waves as its bed and meditation bells as its accent — a natural-sounding crossover without manual intervention.
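The layer split described above can be sketched as a small lookup. This is a minimal illustration, not the actual contents of lib/mapping.ts; the names `Layer`, `SECONDARY_LAYERS`, and `typeForLayer` are hypothetical.

```typescript
// Hypothetical names throughout; the source only describes the behaviour.
type Layer = "base" | "texture" | "accent" | "rhythm" | "atmosphere";

// Layers that take their tags from the secondary type when one exists.
const SECONDARY_LAYERS: ReadonlySet<Layer> = new Set<Layer>(["accent", "rhythm"]);

/** Returns which Pokémon type should supply the search tags for a layer. */
function typeForLayer(layer: Layer, types: readonly string[]): string {
  const [primary, secondary] = types;
  if (primary === undefined) {
    throw new Error("a Pokémon needs at least one type");
  }
  // Single-type Pokémon fall back to the primary type for every layer.
  return SECONDARY_LAYERS.has(layer) && secondary !== undefined
    ? secondary
    : primary;
}
```

For a Water/Psychic Pokémon, `typeForLayer("base", ["water", "psychic"])` resolves to the water tags and `typeForLayer("accent", ["water", "psychic"])` to the psychic tags.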
Habitat blending: Instead of overriding the type when a habitat is known, the engine blends: the first two type tags and the first two habitat tags are OR’d together. A fire-type Pokémon that lives in a cave searches fire OR crackling OR "cave ambience" OR underground, preserving its type identity while adding environmental specificity.
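A minimal sketch of the blending step, assuming tags are held as plain string arrays. The function name `blendTags` and the sample habitat tags are illustrative, not taken from the project.

```typescript
/**
 * Hypothetical helper: keeps the first two type tags and the first two
 * habitat tags so the type identity survives alongside the environment.
 */
function blendTags(
  typeTags: readonly string[],
  habitatTags: readonly string[] = [],
): string[] {
  return [...typeTags.slice(0, 2), ...habitatTags.slice(0, 2)];
}

// Fire type living in a cave (habitat tags here are made-up examples).
const blended = blendTags(
  ["fire", "crackling", "lava", "furnace"],
  ["cave ambience", "underground", "dripping"],
);
// blended is ["fire", "crackling", "cave ambience", "underground"]
```

When no habitat is known, the default empty array leaves just the two leading type tags.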
Stats → mix parameters: All six base stats control how the layers are mixed, not what sounds are chosen:
| Stat | Controls |
|---|---|
| Speed | Playback rate (0.8×–1.3×) — urgent vs. lazy feel |
| Attack | Accent layer volume |
| Sp. Attack | Atmosphere layer intensity + stereo width |
| Defense | Base layer volume — heavier vs. lighter foundation |
| Sp. Defense | Low-pass filter cutoff — brighter vs. darker overall sound |
| HP | Layer density — high HP Pokémon have fuller, denser soundscapes |
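As a rough sketch of how a stat could drive a mix parameter, the snippet below maps Speed linearly onto the documented ranges. The linear curve, the normalisation against 255 (the largest base stat value in the games), and all function names are assumptions; the source only gives the output ranges.

```typescript
// Assumption: stats are normalised against 255, the largest base stat value.
const MAX_BASE_STAT = 255;

/** Linear interpolation between min and max for t in [0, 1]. */
function lerp(min: number, max: number, t: number): number {
  return min + (max - min) * t;
}

/** Clamps a raw base stat into the [0, 1] range. */
function normalise(stat: number): number {
  return Math.min(1, Math.max(0, stat / MAX_BASE_STAT));
}

/** Speed mapped to a playback rate in the documented 0.8x to 1.3x range. */
function playbackRate(speed: number): number {
  return lerp(0.8, 1.3, normalise(speed));
}

/**
 * Speed mapped to seconds between accent triggers (documented 5 to 15 s).
 * Assumed direction: faster Pokémon trigger accents more often.
 */
function accentIntervalSeconds(speed: number): number {
  return lerp(15, 5, normalise(speed));
}
```

The same `lerp` and `normalise` pair would extend to the other five stats, each with its own output range.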
The Audio Engine (lib/audio-engine.ts)
Tone.js orchestrates the full Web Audio graph:
Freesound Preview MP3s
│
▼
Tone.Player (per layer, looped)
├── Base → Volume → Panner → Filter → Reverb → Destination
├── Texture → Volume → Panner → ┘
├── Accent → Volume → Panner → ┘ (intermittent, Speed-driven)
├── Rhythm → Volume → Panner → ┘
├── Atmosphere → Volume → Panner → ┘
└── Epic → Volume → Panner → ┘ (Legendary only)
│
Waveform Analyser
FFT Analyser
(feeds canvas visualizer + particles)
The engine exposes a state machine (idle → loading → ready → playing → paused → stopped), per-layer mute/volume controls, a waveform analyser and FFT analyser for visualizations, and full dispose/cleanup to prevent audio node leaks on navigation.
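One way to keep such a state machine honest is an explicit transition table. The sketch below uses the six states named above, but the allowed transitions are an assumption, since the source lists the states only in order.

```typescript
type EngineState =
  | "idle" | "loading" | "ready" | "playing" | "paused" | "stopped";

// Assumed transition table; only the state names come from the write-up.
const TRANSITIONS: Record<EngineState, readonly EngineState[]> = {
  idle: ["loading"],
  loading: ["ready", "idle"], // back to idle if loading fails
  ready: ["playing"],
  playing: ["paused", "stopped"],
  paused: ["playing", "stopped"],
  stopped: ["playing", "loading"],
};

/** Returns the new state, or throws if the move is not allowed. */
function transition(from: EngineState, to: EngineState): EngineState {
  if (TRANSITIONS[from].indexOf(to) === -1) {
    throw new Error(`illegal transition: ${from} -> ${to}`);
  }
  return to;
}
```

Rejecting illegal moves, such as playing before loading has finished, makes UI bugs surface as errors rather than silent audio glitches.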
Freesound Proxy (app/api/freesound/route.ts)
All Freesound API calls are proxied through a Next.js API route to keep the API key server-side. The client never touches Freesound directly. Results are cached in-memory for 24 hours per query string, drastically reducing API usage against the 2,000 req/day limit.
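A minimal sketch of a per-query cache with a 24-hour lifetime follows. The factory name `createTtlCache` and the injectable clock are illustrative choices, not the route's actual code.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

/**
 * Hypothetical in-memory cache keyed by query string.
 * The clock is injectable so expiry can be tested without waiting.
 */
function createTtlCache<V>(
  ttlMs: number = DAY_MS,
  now: () => number = () => Date.now(),
) {
  const entries = new Map<string, { value: V; expiresAt: number }>();
  return {
    get(key: string): V | undefined {
      const hit = entries.get(key);
      if (hit === undefined) return undefined;
      if (now() >= hit.expiresAt) {
        entries.delete(key); // expired: drop it and report a miss
        return undefined;
      }
      return hit.value;
    },
    set(key: string, value: V): void {
      entries.set(key, { value, expiresAt: now() + ttlMs });
    },
  };
}
```

A map like this lives inside a single server instance, so the real hit rate depends on how the host reuses instances between requests.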
The soundscape orchestrator (lib/soundscape.ts) uses a tiered fallback per layer:
- All tags OR’d together + tag:loop filter
- All tags OR’d together, no loop filter
- Individual tags queried one at a time
This keeps the chance of an empty layer low, even for obscure tag combinations.
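The three tiers can be sketched as a function that takes the search call as a parameter. The `Search` signature and the name `searchWithFallback` are hypothetical, and the tags are assumed to be already quoted where needed.

```typescript
// Hypothetical search signature: a query string plus a loop-only switch.
type Search<T> = (query: string, loopOnly: boolean) => Promise<T[]>;

/** Tries progressively looser queries until one returns results. */
async function searchWithFallback<T>(
  tags: readonly string[],
  search: Search<T>,
): Promise<T[]> {
  const allTags = tags.join(" OR ");

  // Tier 1: all tags OR'd together, restricted to clips tagged as loops.
  let results = await search(allTags, true);
  if (results.length > 0) return results;

  // Tier 2: the same query without the loop restriction.
  results = await search(allTags, false);
  if (results.length > 0) return results;

  // Tier 3: each tag on its own, in order.
  for (const tag of tags) {
    results = await search(tag, false);
    if (results.length > 0) return results;
  }
  return []; // nothing matched; the caller decides how to handle the gap
}
```

Passing the search function in keeps the tier logic testable without touching the network.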
Spirit Pokémon Quiz (lib/spirit-quiz.ts)
The reverse flow: instead of picking a Pokémon to hear, users take a 7-round A/B listening quiz. Each round presents two short audio clips and asks which resonates more. Four rounds test type affinity (fire vs. water, psychic vs. dark, etc.), three test stat preference (sparse vs. dense, fast vs. slow, bright vs. muffled).
Choices accumulate into a listener profile — a vector of type affinities and stat weights. At the end, the profile is compared against a pre-computed index of all 151 Gen 1 Pokémon using a scoring algorithm that rewards type and stat alignment. The closest match is revealed as the user’s Spirit Pokémon.
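The source does not spell out the scoring algorithm, so the following is only one plausible shape for it: sum the listener's affinity for each of a Pokémon's types, then add a weighted sum of its normalised stats. All type names, field names, and numbers below are illustrative.

```typescript
interface ListenerProfile {
  typeAffinity: Record<string, number>; // e.g. { fire: 2, water: -1 }
  statWeight: Record<string, number>;   // e.g. { speed: 1 }
}

interface IndexEntry {
  name: string;
  types: string[];
  stats: Record<string, number>; // assumed pre-normalised to 0..1
}

/** Assumed scoring rule: type affinity sum plus weighted stat sum. */
function score(profile: ListenerProfile, entry: IndexEntry): number {
  let total = 0;
  for (const type of entry.types) {
    total += profile.typeAffinity[type] ?? 0;
  }
  for (const stat of Object.keys(profile.statWeight)) {
    total += (profile.statWeight[stat] ?? 0) * (entry.stats[stat] ?? 0);
  }
  return total;
}

/** Returns the highest-scoring entry; throws on an empty index. */
function findSpiritPokemon(
  profile: ListenerProfile,
  index: readonly IndexEntry[],
): IndexEntry {
  return index.reduce((best, entry) =>
    score(profile, entry) > score(profile, best) ? entry : best,
  );
}
```

Because the index is pre-computed, matching is a single pass over 151 entries.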
Key Technical Challenges
1. Web Audio autoplay policy
Browsers keep an audio context suspended until the user interacts with the page. The app routes every play action through Tone.start(), which must be called inside a click/tap handler. The UI reflects this constraint: there’s always an explicit Play button, and nothing autoplays.
2. Freesound tag query semantics
Early in development, tags were joined with spaces ("shore lakeshore river bank"), which Freesound treats as AND — requiring all terms to match, often returning zero results. The fix: all tags are joined with OR, and multi-word phrases are quoted ("river bank"). A buildQuery() helper enforces this across all layers.
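A helper along these lines would enforce the OR-joining and phrase-quoting rule. Only the name `buildQuery` comes from the project; the body is a guess at its behaviour.

```typescript
/**
 * Joins tags with OR and wraps multi-word tags in double quotes,
 * so "river bank" is matched as a phrase rather than two AND'd terms.
 */
function buildQuery(tags: readonly string[]): string {
  return tags
    .map((tag) => tag.trim())
    .filter((tag) => tag.length > 0) // skip blank entries
    .map((tag) => (/\s/.test(tag) ? `"${tag}"` : tag))
    .join(" OR ");
}
```

With the tags from the failing example, `buildQuery(["shore", "lakeshore", "river bank"])` yields `shore OR lakeshore OR "river bank"`.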
3. Audio level normalization across clips
Freesound clips are mastered at wildly different levels, creating jarring volume jumps between tracks and especially in the Spirit quiz (where users directly compare two clips). The fix: all audio players in the quiz route through a processing chain of Tone.Player(-6 dB) → Tone.Compressor(threshold: -18, ratio: 4:1) → Tone.Limiter(-3 dB), which closes the gap between loud and quiet sources without pre-processing delays.
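To see why this chain narrows the gap, it helps to run the numbers through an idealised model. The sketch below treats the compressor as a hard-knee static curve and ignores attack, release, knee softness, and makeup gain, so real output levels will differ; the two clip levels are made up for illustration.

```typescript
/** Idealised hard-knee compressor: output level in dB for an input level. */
function compress(
  levelDb: number,
  thresholdDb: number = -18,
  ratio: number = 4,
): number {
  return levelDb <= thresholdDb
    ? levelDb // below threshold: unchanged
    : thresholdDb + (levelDb - thresholdDb) / ratio; // above: slope 1/ratio
}

/** Player trim (-6 dB), then compression, then a -3 dB limiter ceiling. */
function chainOutputDb(clipPeakDb: number): number {
  const trimmed = clipPeakDb - 6;
  return Math.min(compress(trimmed), -3);
}

// Illustrative clips: one hot (-1 dB peak), one quiet (-19 dB peak).
const loud = chainOutputDb(-1);   // -7 dB after trim, compressed to -15.25 dB
const quiet = chainOutputDb(-19); // -25 dB after trim, below threshold
const gapAfter = loud - quiet;    // 9.75 dB, down from 18 dB
```

In this model only the louder clip crosses the threshold, which is what pulls the two levels closer together.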
4. Non-looping clips in looping layers
Many Freesound clips aren’t tagged as loops and don’t loop cleanly. Setting loop: true on Tone.Player causes audible clicks on clips with DC offset or mismatched start/end points. The engine detects non-loop clips and applies a crossfade to smooth the transition at the loop boundary.
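The source does not say which crossfade curve is used. A common choice is an equal-power curve, sketched here as an assumption; the function name `crossfadeGains` is hypothetical.

```typescript
/**
 * Equal-power crossfade gains at progress t in [0, 1].
 * The squared gains always sum to 1, which avoids the perceived
 * volume dip of a plain linear fade.
 */
function crossfadeGains(t: number): { fadeOut: number; fadeIn: number } {
  const x = Math.min(1, Math.max(0, t)); // clamp progress
  return {
    fadeOut: Math.cos((x * Math.PI) / 2), // tail of the outgoing pass
    fadeIn: Math.sin((x * Math.PI) / 2),  // head of the incoming pass
  };
}
```

At the loop boundary the outgoing copy of the clip would follow `fadeOut` while a second copy, restarted from the top, follows `fadeIn`.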
5. Stacking context z-index with autocomplete
The search bar’s autocomplete dropdown was being rendered behind the Pokémon card grid because both parent containers used z-10, so neither stacking context outranked the other and the grid, which comes later in the page, painted on top. The fix: the search section wrapper was elevated to z-20, which places the dropdown’s stacking context above the grid beneath it.
Features Built
- Soundscape generation for any Pokémon across 18 types, with dual-type blending, habitat blending, and legendary/mythical epic layer
- Audio-reactive canvas visualizer: circular waveform ring drawn from Tone.Waveform data, type-colored with glow
- FFT-reactive background particles: type-themed floating shapes (flames, bubbles, leaves, bolts) whose size, speed, and glow respond to real-time FFT frequency data
- Per-layer mixer with mute toggles and volume sliders; animated equalizer bars when playing
- Spirit Pokémon quiz — 7 A/B rounds, profile builder, matching algorithm against 151-Pokémon index
- Dynamic OG images via @vercel/og: 1200×630 cards with Pokémon sprite, name, and type for Twitter/OG sharing
- Web Share API integration with clipboard fallback
- CC attribution — collapsible credits panel listing each sound’s title, author, license, and link (legally required for CC-BY/CC-BY-NC)
- Autocomplete search with keyboard navigation and ARIA combobox semantics
- Type-themed soundscape pages — background gradient, particle color, and waveform ring color all driven by the Pokémon’s primary type
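For the waveform ring in the feature list above, the geometry can be sketched as a pure function that turns one frame of waveform samples (values between -1 and 1) into points around a circle. The function name and parameters are illustrative; the drawing itself, with colour and glow, is left out.

```typescript
interface Point { x: number; y: number }

/**
 * Maps waveform samples onto a ring: each sample pushes its point
 * outward (positive) or inward (negative) from the base radius.
 */
function ringPoints(
  samples: ArrayLike<number>,
  cx: number,
  cy: number,
  baseRadius: number,
  amplitude: number,
): Point[] {
  const points: Point[] = [];
  for (let i = 0; i < samples.length; i++) {
    const angle = (i / samples.length) * 2 * Math.PI;
    const radius = baseRadius + (samples[i] ?? 0) * amplitude;
    points.push({
      x: cx + radius * Math.cos(angle),
      y: cy + radius * Math.sin(angle),
    });
  }
  return points;
}
```

A canvas render loop would call this once per animation frame and connect the points into a closed path.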
Testing
206 total tests across 9 files:
| Suite | Tests | Coverage |
|---|---|---|
| mapping.test.ts | 52 | All 18 types, dual-type blending, habitat blending, legendary/mythical, stat edge cases |
| audio-engine.test.ts | 42 | State machine, playback, layer control, waveform/FFT analysers, dispose |
| spirit-quiz.test.ts | 28 | Round structure, profile building, matching algorithm, edge cases |
| freesound.test.ts | 18 | Cache hit/miss/clear, quality filters, error handling, missing API key |
| soundscape.test.ts | 10 | Full pipeline for Charizard/Pikachu/Gengar/Mewtwo, partial failure resilience, attribution |
| Playwright E2E (4 files) | 56 | Landing page, soundscape player, Spirit quiz flow, responsive layout — Desktop + Mobile Chrome |
Design Decisions
Dark, atmospheric UI: The app leans into immersion. Each Pokémon page uses a type-derived gradient background (fire → orange-950/red-950, ghost → purple-950/violet-950, etc.) with floating type-themed particles behind the sprite. The palette is designed to feel like a stage, not a data dashboard.
Type-colored card borders: The popular Pokémon grid uses ring-2 borders in each Pokémon’s primary type color (orange for fire, yellow for electric, cyan for ice) to make the grid feel alive and immediately informative before any interaction.
Randomised variety: The soundscape generator fetches 5 Freesound candidates per layer and picks one randomly. The same Pokémon sounds slightly different on every visit — a deliberate design choice that encourages regeneration and repeat use.
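The per-layer pick can be as small as the sketch below. The name `pickRandom` is hypothetical, and the random source is a parameter so the choice can be pinned in tests.

```typescript
/** Picks one candidate uniformly at random; throws on an empty list. */
function pickRandom<T>(
  candidates: readonly T[],
  random: () => number = Math.random, // injectable for deterministic tests
): T {
  if (candidates.length === 0) {
    throw new Error("no candidates to pick from");
  }
  const index = Math.floor(random() * candidates.length);
  return candidates[index] as T;
}
```

With five candidates per layer and five core layers, independent picks allow up to 5^5 = 3,125 combinations for a single Pokémon.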
Mobile-first layout: The primary sharing surface is social media, where content is consumed on phones. The sprite, controls, and mixer are all sized and spaced for thumbs. Safe-area padding handles notched devices.
What I’d Do Differently
- Pre-compute soundscapes at build time for the original 151 Pokémon and serve them as static JSON. This eliminates Freesound API calls entirely for the most common searches and would make the app viable at scale without a paid API tier.
- Populate the curated sound bank. A curated bank for 8 priority types (data/curated-sounds.json) was scaffolded and the wiring is in place (soundscape.ts), but the first population run hit Freesound’s rate limit. Completing it would improve soundscape quality and reduce live API dependency.
- Add evolution chain crossfades: hearing a Charmander soundscape gradually transform into Charizard’s as the Tone.js players crossfade between them would be a compelling interactive moment.
Outcome
PokéSound demonstrates that structured Pokémon data can be used as a compositional system: the Mapping Engine acts as a deterministic score that Tone.js performs in the browser on every visit, while random clip selection varies each performance. The result is a tool that feels both algorithmic and emotionally resonant, with enough variety to reward repeated use.