Python SDK
The official Python SDK for ByCrawl — typed responses, async support, auto-pagination, and built-in retry logic.
New to ByCrawl? Start with Getting Started to create an account and get your API key.
Installation
Install the SDK
```bash
pip install bycrawl
```
Set your API key
Environment Variable
```bash
export BYCRAWL_API_KEY="sk_byc_..."
```
```python
from bycrawl import ByCrawl

client = ByCrawl()  # picks up the key from BYCRAWL_API_KEY
```
Never commit your API key to source control. Use environment variables or a secrets manager.
Make your first call
```python
from bycrawl import ByCrawl

client = ByCrawl()

user = client.threads.get_user("zuck")
print(user.data.username)        # "zuck"
print(user.data.follower_count)  # 3200000

post = client.threads.get_post("DQt-ox3kdE4")
print(post.data.text)
print(post.data.stats.likes)
```
Response Objects
Every SDK method returns an APIResponse[T] object with three parts:
```python
resp = client.threads.get_user("zuck")

# 1. Typed data — full autocomplete in your IDE
resp.data.username        # str
resp.data.follower_count  # int
resp.data.is_verified     # bool

# 2. Rate limit metadata
resp.rate_limit.remaining  # requests remaining in window
resp.rate_limit.reset      # reset timestamp

# 3. Credit usage
resp.credit.remaining  # credits left
resp.credit.used       # credits consumed by this request
```
The .data field is typed per endpoint, so your IDE provides autocompletion for every field — no need to guess key names or dig through raw JSON.
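Conceptually, the three-part wrapper can be pictured as a small generic container. This is an illustrative sketch only, not the SDK's actual source; the class and field names are assumed from the attribute access shown above:

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")

@dataclass
class RateLimit:
    remaining: int  # requests remaining in the current window
    reset: int      # unix timestamp when the window resets

@dataclass
class Credit:
    remaining: int  # credits left on the account
    used: int       # credits consumed by this request

@dataclass
class APIResponse(Generic[T]):
    data: T                # typed per endpoint, hence the IDE autocompletion
    rate_limit: RateLimit
    credit: Credit
```

Because `data` is the only part whose type varies, the wrapper stays uniform across all endpoints while the payload stays fully typed.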
Platforms
| Platform | Namespace |
|---|---|
| Threads | client.threads |
| X / Twitter | client.x |
| Facebook | client.facebook |
| Instagram | client.instagram |
| Reddit | client.reddit |
| TikTok | client.tiktok |
| LinkedIn | client.linkedin |
| YouTube | client.youtube |
| Dcard | client.dcard |
| Google Maps | client.gmaps |
| 104 Jobs | client.job104 |
All platforms follow the same pattern:
```python
# User profile
user = client.threads.get_user("zuck")
user = client.x.get_user("elonmusk")
user = client.instagram.get_user("instagram")

# Single post
post = client.threads.get_post("DQt-ox3kdE4")
tweet = client.x.get_post("1234567890")
video = client.tiktok.get_video("7123456789")

# User posts
posts = client.facebook.get_user_posts("NASA")
posts = client.reddit.get_subreddit_posts("python", sort="hot")
```
Common Patterns
Search
```python
# Search across platforms
results = client.threads.search_posts("AI")

jobs = client.linkedin.search_jobs("software engineer", location="Taiwan")
jobs = client.job104.search_jobs(q="Python工程師")
```
Auto-Pagination
Methods prefixed with iter_ return iterators that automatically handle cursor-based pagination:
```python
for post in client.threads.iter_user_posts("zuck"):
    print(post.text)

for job in client.linkedin.iter_search_jobs("data scientist"):
    print(job.title, job.location)
```
Iterators are lazy — pages are fetched on demand as you consume results. No data is loaded until you start iterating.
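Under the hood, a lazy cursor-following iterator can be sketched as a plain generator. This is a simplified illustration of the pattern, not the SDK's implementation; `fetch_page` is a hypothetical helper standing in for one HTTP request:

```python
from typing import Callable, Iterator, Optional

def iter_pages(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield items across pages, following cursor-based pagination lazily.

    `fetch_page(cursor)` is a hypothetical helper returning a dict shaped
    like {"items": [...], "next_cursor": "..." or None}.
    """
    cursor = None
    while True:
        page = fetch_page(cursor)  # one request per page, made only on demand
        yield from page["items"]
        cursor = page.get("next_cursor")
        if not cursor:             # no cursor means the last page was reached
            return
```

Because the generator body only runs as the caller consumes items, breaking out of the loop early means later pages are never requested.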
Available iterators:
| Platform | Method |
|---|---|
| Threads | iter_user_posts(), iter_user_replies() |
| X | iter_user_posts(), iter_search_posts() |
| LinkedIn | iter_company_jobs(), iter_search_jobs() |
| TikTok | iter_video_comments() |
| 104 Jobs | iter_search_jobs() |
Bulk Operations
For fetching many Threads posts at once:
```python
# Submit and wait (blocking)
result = client.threads.bulk_submit_and_wait(
    ["post_id_1", "post_id_2", "post_id_3"],
    poll_interval=2.0,  # seconds between status checks
    timeout=300.0,      # max wait time
)
for post in result.data:
    print(post.text)

# Or manage manually
job = client.threads.bulk_submit(["id1", "id2", "id3"])
status = client.threads.bulk_status(job.data.job_id)
results = client.threads.bulk_results(job.data.job_id)
```
Async Support
All methods have async equivalents via AsyncByCrawl. The API is identical — just await calls and use async for with iterators.
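For illustration, the async equivalent of a simple flow might look like this. It is a sketch that assumes AsyncByCrawl mirrors the sync client exactly, as the note above states:

```python
import asyncio

from bycrawl import AsyncByCrawl

async def main() -> None:
    client = AsyncByCrawl()

    # Same call shapes as the sync client, just awaited
    user = await client.threads.get_user("zuck")
    print(user.data.username)

    # Iterators are consumed with `async for`
    async for post in client.threads.iter_user_posts("zuck"):
        print(post.text)

asyncio.run(main())
```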
Sync
```python
from bycrawl import ByCrawl

client = ByCrawl()

user = client.threads.get_user("zuck")
print(user.data.username)

for post in client.threads.iter_user_posts("zuck"):
    print(post.text)
```
Error Handling
```python
from bycrawl import ByCrawl, NotFoundError, RateLimitError, AuthenticationError

client = ByCrawl()

try:
    post = client.threads.get_post("invalid_id")
except NotFoundError:
    print("Post not found")
except RateLimitError as e:
    print(f"Rate limited — retry after {e.retry_after}s")
except AuthenticationError:
    print("Invalid API key")
```
Exception hierarchy:
| Exception | HTTP Status | Description |
|---|---|---|
| ByCrawlError | — | Base exception |
| APIError | any | HTTP API error (has status_code, body) |
| AuthenticationError | 401 | Invalid or missing API key |
| PermissionError | 403 | Insufficient scope |
| NotFoundError | 404 | Resource not found |
| RateLimitError | 429 | Rate/credit limit exceeded (has retry_after) |
| ServerError | 5xx | Server-side error |
| TimeoutError | — | Request timed out |
| ConnectionError | — | Network connection error |
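Because every HTTP-status exception inherits from APIError, one handler can cover all of them at once. The class definitions below are an illustrative sketch of that inheritance, not the SDK's actual source:

```python
# Illustrative re-creation of part of the hierarchy above
class ByCrawlError(Exception):
    """Base exception for all SDK errors."""

class APIError(ByCrawlError):
    """Any HTTP API error; carries the status code and response body."""
    def __init__(self, status_code: int, body: str = ""):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code
        self.body = body

class NotFoundError(APIError):
    """404: resource not found."""

class ServerError(APIError):
    """5xx: server-side error."""

def describe(exc: APIError) -> str:
    # Catching the APIError base class covers every HTTP-status subclass.
    try:
        raise exc
    except APIError as e:
        return f"API error {e.status_code}"
```

So `except APIError` works as a catch-all for HTTP failures, while `except ByCrawlError` additionally covers timeouts and connection errors.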
Configuration
```python
client = ByCrawl(
    api_key="sk_byc_...",                # or BYCRAWL_API_KEY env var
    base_url="https://api.bycrawl.com",  # default
    timeout=60.0,                        # seconds, default 60
    max_retries=2,                       # auto-retry on 429/5xx, default 2
)
```
The SDK automatically retries on rate limit (429) and server errors (5xx) with exponential backoff and jitter, and respects the Retry-After header.
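That retry behaviour can be sketched roughly as follows. This is a simplified illustration, not the SDK's actual implementation; `send` is a hypothetical zero-argument function returning a response with `status_code` and `headers`:

```python
import random
import time

def request_with_retries(send, max_retries: int = 2, base_delay: float = 1.0):
    """Call send() and retry on 429/5xx with exponential backoff and jitter."""
    for attempt in range(max_retries + 1):
        resp = send()
        if resp.status_code != 429 and resp.status_code < 500:
            return resp  # success or a non-retryable client error
        if attempt == max_retries:
            return resp  # retries exhausted; surface the last response
        retry_after = resp.headers.get("Retry-After")
        if retry_after is not None:
            delay = float(retry_after)  # server-provided wait takes precedence
        else:
            # full jitter: uniform over the exponentially growing window
            delay = random.uniform(0, base_delay * (2 ** attempt))
        time.sleep(delay)
```

With `max_retries=2` this makes at most three attempts, and the backoff window doubles after each failed one.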
TikTok search, comments, and user video endpoints use browser automation and take 20–40 seconds. Use a timeout of at least 60s. See Limitations for details.