Elofyn / Tools / Text Case Converter / About

About the Text Case Converter

Seven cases, one paste. A tour of where these conventions come from, how the tool tokenises arbitrary text into clean identifiers, and what it deliberately does not try to do. Back to the tool ↗

Why a tool that does just this exists

Every working day touches naming conventions. A Slack message becomes a Linear ticket becomes a function name becomes a database column becomes a CSV header becomes a doc title. Each of those layers wants the same words, but in a different case. Doing it by hand is fast for one word, tedious for fifty, and error-prone in between. The bug where one column out of forty is user_id and the rest are userId is the kind of small-but-real friction this tool is for. Paste once, read all seven outputs at a glance, copy the one you need. No server hop, no signup, no waiting on a parse-then-format round-trip — every transform is closed-form per-character work that finishes before your finger leaves the key.

What “case” means as a written-language concept

Latin script is bicameral: it has two parallel sets of letterforms, majuscule (capitals) and minuscule (lowercase). That was not always so. The capitals descend from the inscribed monumental scripts of Imperial Rome — letterforms cut into stone for permanence. The minuscule shapes are a medieval invention, standardised in the Carolingian reforms of the late 8th century as a faster, more legible script for hand-copied manuscripts. The two systems then merged into the modern alphabet, where every letter has an UPPER and a lower form and writers choose between them by convention.

Modern English uses four prose-case conventions that this tool implements. UPPER CASE is for shouting, headlines, and acronyms. lower case is the unmarked default of running text. Title Case capitalises every word, and is the rule used on book covers, on menu items, and in the simplest UI labels. Sentence case capitalises only the first word of each sentence (plus proper nouns, which the tool does not try to detect). One subtlety worth naming: “Title Case” is actually a family of conflicting rules. The Chicago Manual of Style, the AP Stylebook, and MLA all disagree about which short prepositions, articles, and conjunctions should remain lowercase (“the”, “of”, “a”, “and”, and so on). This tool implements the simple “every word capitalised” rule — see “What the tool deliberately does not do” below for why.

What “case” means as a programming convention

Programmers inherited the bicameral alphabet and used it to encode meaning into identifiers. The history is varied. Early FORTRAN was UPPERCASE-only — punched cards did not encode lowercase. Lisp grew up case-insensitive and gave us hyphen-separated names like call-with-current-continuation, the spiritual ancestor of modern kebab-case. Smalltalk invented camelCase in the 1970s; Java codified it through the standard library, and JavaScript, C#, and modern frontend code all inherited the convention. Python and Ruby went the other way: PEP 8 and the community Ruby style guide both standardised on snake_case for variables and function names. Go layered something more semantic on top: an identifier is exported from its package if its first letter is uppercase (PascalCase = exported, camelCase = unexported), so the case choice carries access-modifier meaning.

URL slugs, npm package names, and a lot of CSS use kebab-case because URLs treat dashes and underscores differently and browsers, historically, were friendlier to dashes. SQL columns traditionally use snake_case because most databases are case-insensitive on unquoted identifiers, and dashes inside a name conflict with the SQL minus operator. None of this is arbitrary — each convention solves a problem its host system actually has. The reason a tool like this is useful is that nobody works in just one host system at a time. A typical change might cross JavaScript (camel), Python (snake), a CSS file (kebab), and a SQL migration (snake again, but with a twist of CAPITAL CONSTANT_CASE if the field is an enum value) in a single afternoon.

Word boundaries: the surprisingly hard part

Converting case is the easy half. The hard half is deciding where the word boundaries are in the input. Naive splits get you into trouble fast. HTTPRequest should produce two words — HTTP and Request, not three of unequal length — userProfileCache should produce three words, user_profile_cache should produce the same three, and User Profile Cache should converge on the same three again. To make that work, the tokeniser layers code-identifier rules on top of the language-aware word boundaries from the Unicode standard (UAX #29). Specifically: whitespace, plus the punctuation characters _, -, ., /, \, and :, are word boundaries with the separator dropped. A lowercase-to-uppercase transition is a boundary (the camel hump). A run of uppercase followed by an uppercase-then-lower pair is split before the final uppercase letter (the acronym hump: HTTPRequest → HTTP + Request). Alphabetic-to-digit and digit-to-alphabetic transitions are boundaries too (v2model splits into v, 2, model). These are the same rules the change-case library uses by default, which is why this tool uses it directly: the canonical implementation has been in production for years and we do not need a second one.

The seven cases the tool emits

For the input the quick brown fox, the seven outputs are: THE QUICK BROWN FOX (UPPER), the quick brown fox (lower), The Quick Brown Fox (Title), The quick brown fox (Sentence), theQuickBrownFox (camel), the_quick_brown_fox (snake), the-quick-brown-fox (kebab). For the input userProfileCache the identifier outputs converge: user_profile_cache, user-profile-cache, and User Profile Cache in Title. For THE QUICK BROWN FOX. THE LAZY DOG., Sentence yields The quick brown fox. The lazy dog. — first letter after a sentence terminator promoted, rest lowercased. Newlines in the input are preserved verbatim in every output cell; each line is tokenised independently for the identifier cells, so a two-line input becomes two newline- separated identifiers, not one joined string.

What the tool deliberately does not do

A few capabilities are intentionally out of scope. Headline-style Title Case (the Chicago / AP / MLA rule that lowercases short prepositions) is omitted because the rules disagree between style guides and we would rather get the simple “every word capitalised” rule exactly right. Acronym preservation in identifier cases — keeping HTTPRequest as HTTPRequest in PascalCase rather than collapsing it to httpRequest — would need a config toggle and a more elaborate parser, which is a v1.1 feature, not a v1 one. There is no locale override: the browser’s default locale governs case operations, so a Turkish-locale user pasting i gets İ (dotted capital), which is the correct Turkish behaviour. There is no Unicode NFC / NFD normalisation, no bulk file ingestion, and no persistent history. The seven cases listed are the v1 contract; PascalCase, CONSTANT_CASE, and dot.case may follow behind a disclosure in a future release if the demand is there.

Common anti-patterns

A few habits that this tool was built to short-circuit. Storing the same value in two cases in two systems and trying to keep them in sync — pick one canonical form, derive the rest at render time. Using toUpperCase() instead of toLocaleUpperCase() in JavaScript — the former silently breaks Turkish and a few other locales. Hand-camelCasing 200 column names from a CSV header row by deleting underscores in your editor — the regex you reach for will trip over an acronym sooner or later. Relying on a do-it-yourself str.split(/[_-]/) tokeniser that handles user_profile but breaks on UserProfile or v2Model. When the canonical word-boundary rules are one library import away, rolling your own is rarely the right move.

References

Davis, M. et al. (current revision). Unicode Standard Annex #29 — Unicode Text Segmentation. The authoritative specification for word boundaries used by the tokeniser, including the rules for grapheme clusters, word breaks, and sentence breaks that the seven case outputs respect.
Wikipedia contributors (perennial). Letter case. The history of bicameral script (majuscule / minuscule), the orthographic conventions of UPPER / lower / Title / Sentence case in modern English, and the existence of multiple conflicting Title Case rules.
Embrey, B. change-case (GitHub). The reference implementation for the identifier-style conversions, including the acronym-hump and number-boundary tokenisation rules described above. The library is MIT- licensed and powers the camelCase, snake_case, and kebab-case outputs of this tool.

Related tools

/tools/slug — Slug generator — turn any sentence into a URL-safe slug, Unicode-aware.
/tools/csv-json — CSV ↔ JSON converter — auto-detect delimiter, header row toggle, all in-browser.
/tools/json-yaml — JSON ↔ YAML converter — live side-by-side preview, no upload.
/tools/markdown-html — Markdown ↔ HTML converter — live preview, round-trip in browser.