gh-150821: Skip URL parsing in mimetypes.guess_type() for file paths by gaborbernat · Pull Request #150828 · python/cpython

gaborbernat · 2026-06-02T23:29:25Z

mimetypes.guess_type() accepts either a URL or a filesystem path, so it parses its argument with urllib.parse.urlparse() before looking at the extension. The common argument is a plain file path, which has no URL scheme to find, so the parse (and the urllib.parse import it triggers) is wasted work. Guessing content types from file names is everywhere: static-file servers, upload handlers, and archive and build tools that decide how to treat each file as they walk a tree of thousands.

A URL scheme requires a :, so a path without one cannot be a URL. This detects that case and goes straight to extension lookup, skipping urlparse() and its lazy import. Real URLs, and the rare path that contains a :, still take the full parsing path, and results are unchanged for both.
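For reviewers who want the idea in code form, here is a minimal sketch of the fast path written as a wrapper around the public API. It is not the actual diff to Lib/mimetypes.py: `guess_type_fast` and `_extension_lookup` are hypothetical names, the sketch assumes `str` or path-like arguments, and it assumes `mimetypes.guess_file_type()` (Python 3.13+) is the extension-only lookup, falling back to `guess_type()` on older interpreters.

```python
import mimetypes
import os
from urllib.parse import urlparse

# Assumption: guess_file_type() (Python 3.13+) is the extension-only lookup.
# On older interpreters this falls back to guess_type(), which still parses.
_extension_lookup = getattr(mimetypes, "guess_file_type", mimetypes.guess_type)


def guess_type_fast(arg, strict=True):
    """Hypothetical wrapper illustrating the no-colon fast path."""
    arg = os.fspath(arg)
    if ":" not in arg:
        # No ":" means urlparse() could never find a scheme,
        # so go straight to the extension lookup.
        return _extension_lookup(arg, strict=strict)
    # Real URLs, and the rare path containing ":", keep the full parsing path.
    return mimetypes.guess_type(arg, strict=strict)


# The invariant the shortcut relies on: no colon, no scheme.
for name in ("README.md", "setup.cfg", ".flake8"):
    assert urlparse(name).scheme == ""
```

Both branches are expected to return the same value as plain `mimetypes.guess_type()`; only the amount of work on the no-colon branch differs.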

Guessing types for 15 real file names sampled from the top-1000 corpus improves from 23.4 µs to 11.0 µs, 112% faster.

| Benchmark | base | patched |
|---|---|---|
| guess_type x15 file paths | 23.4 µs | 11.0 µs: 112% faster |
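As a quick sanity check on the percentage, the snippet below recomputes the ratio from the rounded figures in the table. It assumes "N% faster" means (base / patched - 1) × 100; the reported 112% was presumably derived from unrounded timings, so the last digit can differ by one when starting from the rounded values.

```python
# Rounded means taken from the table above, in microseconds.
base_us = 23.4
patched_us = 11.0

# Assumption: "N% faster" is (base / patched - 1) * 100.
speedup = base_us / patched_us
percent_faster = (speedup - 1) * 100

print(f"{speedup:.2f}x speedup, about {percent_faster:.0f}% faster")
```

Either way, the patched call takes a bit less than half the time of the base call for this workload.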

Benchmark (pyperf)

Run base vs patched by swapping Lib/mimetypes.py on the same interpreter. The names are real file names sampled from the top-1000 corpus.

import mimetypes
import pyperf

# Load the type database up front so its cost is not part of the measurement.
mimetypes.init()

# Plain file names: none contains a ":", so none can carry a URL scheme.
names = ["webhook_list.py", "tox.ini", "api_management_delete_policy.py",
    ".env.sample.entra-id", "alerts_get_by_id.py", "ai_prompt_workflow.md",
    "functions.py", "sample_connections.py", "certificate_delete.py",
    "_ai_agents_instrumentor.py", ".flake8", "agent_trace_configurator.py",
    "test_ws_invoke.py", "README.md", "setup.cfg"]

# Each timed call guesses the type of all 15 names.
runner = pyperf.Runner()
runner.bench_func("guess_type x15 file paths",
                  lambda: [mimetypes.guess_type(n) for n in names])

Resolves #150821.

…paths guess_type() parsed every argument as a URL before checking the extension, even for plain file paths that have no scheme. Detect the no-scheme case and go straight to extension lookup, avoiding urlparse() and its lazy import. Real URLs keep the full parsing path; results are unchanged.

gaborbernat requested a review from a team as a code owner June 2, 2026 23:29

bedevere-app Bot mentioned this pull request Jun 2, 2026

Speed up mimetypes.guess_type() for plain file paths #150821

Open

bedevere-app Bot added the awaiting review label Jun 2, 2026

gaborbernat force-pushed the opt/mimetypes-skip-urlparse branch from d35cf04 to 175764e Compare June 2, 2026 23:45

gaborbernat wants to merge 1 commit into python:main from gaborbernat:opt/mimetypes-skip-urlparse
