Python SDK
The official Python SDK for ByCrawl — typed responses, async support, auto-pagination, and built-in retry logic.
New to ByCrawl? Start with Getting Started to create an account and get your API key.
Installation
Install the SDK
```bash
pip install bycrawl
```
Set your API key
Environment Variable
```bash
export BYCRAWL_API_KEY="sk_byc_..."
```
```python
from bycrawl import ByCrawl

client = ByCrawl()  # picks up the key from BYCRAWL_API_KEY
```
Never commit your API key to source control. Use environment variables or a secrets manager.
Make your first call
```python
from bycrawl import ByCrawl

client = ByCrawl()

user = client.threads.get_user("zuck")
print(user.data.username)        # "zuck"
print(user.data.follower_count)  # 3200000

post = client.threads.get_post("DQt-ox3kdE4")
print(post.data.text)
print(post.data.stats.likes)
```
Response Objects
Every SDK method returns an APIResponse[T] object with three parts:
```python
resp = client.threads.get_user("zuck")

# 1. Typed data — full autocomplete in your IDE
resp.data.username        # str
resp.data.follower_count  # int
resp.data.is_verified     # bool

# 2. Rate limit metadata
resp.rate_limit.remaining  # requests remaining in window
resp.rate_limit.reset      # reset timestamp

# 3. Credit usage
resp.credit.remaining  # credits left
resp.credit.used       # credits consumed by this request
```
The .data field is typed per endpoint, so your IDE provides autocompletion for every field — no need to guess key names or dig through raw JSON.
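Conceptually, the three-part wrapper can be pictured as a small generic container. This is an illustrative sketch only, not the SDK's actual source; the class and field names are assumed from the attribute access shown above:

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")

@dataclass
class RateLimit:
    remaining: int  # requests remaining in the current window
    reset: int      # unix timestamp when the window resets

@dataclass
class Credit:
    remaining: int  # credits left on the account
    used: int       # credits consumed by this request

@dataclass
class APIResponse(Generic[T]):
    data: T                # typed per endpoint, hence the IDE autocompletion
    rate_limit: RateLimit
    credit: Credit
```

Because `data` is the only part whose type varies, the wrapper stays uniform across all endpoints while the payload stays fully typed.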
Platforms
| Platform | Namespace |
|---|---|
| Threads | client.threads |
| X / Twitter | client.x |
| Facebook | client.facebook |
| Instagram | client.instagram |
| Reddit | client.reddit |
| TikTok | client.tiktok |
| LinkedIn | client.linkedin |
| YouTube | client.youtube |
| Dcard | client.dcard |
| Google Maps | client.gmaps |
| 104 Jobs | client.job104 |
All platforms follow the same pattern:
```python
# User profile
user = client.threads.get_user("zuck")
user = client.x.get_user("elonmusk")
user = client.instagram.get_user("instagram")

# Single post
post = client.threads.get_post("DQt-ox3kdE4")
tweet = client.x.get_post("1234567890")
video = client.tiktok.get_video("7123456789")

# User posts
posts = client.facebook.get_user_posts("NASA")
posts = client.reddit.get_subreddit_posts("python", sort="hot")
```
Common Patterns
Search
```python
# Search across platforms
results = client.threads.search_posts("AI")

jobs = client.linkedin.search_jobs("software engineer", location="Taiwan")
jobs = client.job104.search_jobs(q="Python工程師")
```
Auto-Pagination
Methods prefixed with iter_ return iterators that automatically handle cursor-based pagination:
```python
for post in client.threads.iter_user_posts("zuck"):
    print(post.text)

for job in client.linkedin.iter_search_jobs("data scientist"):
    print(job.title, job.location)
```
Iterators are lazy — pages are fetched on demand as you consume results. No data is loaded until you start iterating.
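Under the hood, a lazy cursor-following iterator can be sketched as a plain generator. This is a simplified illustration of the pattern, not the SDK's implementation; `fetch_page` is a hypothetical helper standing in for one HTTP request:

```python
from typing import Callable, Iterator, Optional

def iter_pages(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield items across pages, following cursor-based pagination lazily.

    `fetch_page(cursor)` is a hypothetical helper returning a dict shaped
    like {"items": [...], "next_cursor": "..." or None}.
    """
    cursor = None
    while True:
        page = fetch_page(cursor)  # one request per page, made only on demand
        yield from page["items"]
        cursor = page.get("next_cursor")
        if not cursor:             # no cursor means the last page was reached
            return
```

Because the generator body only runs as the caller consumes items, breaking out of the loop early means later pages are never requested.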
Available iterators:
| Platform | Method |
|---|---|
| Threads | iter_user_posts(), iter_user_replies() |
| X | iter_user_posts(), iter_search_posts() |
| LinkedIn | iter_company_jobs(), iter_search_jobs() |
| TikTok | iter_video_comments() |
| 104 Jobs | iter_search_jobs() |
Bulk Operations
For fetching many Threads posts at once:
```python
# Submit and wait (blocking)
result = client.threads.bulk_submit_and_wait(
    ["post_id_1", "post_id_2", "post_id_3"],
    poll_interval=2.0,  # seconds between status checks
    timeout=300.0,      # max wait time
)
for post in result.data:
    print(post.text)

# Or manage manually
job = client.threads.bulk_submit(["id1", "id2", "id3"])
status = client.threads.bulk_status(job.data.job_id)
results = client.threads.bulk_results(job.data.job_id)
```
Async Support
All methods have async equivalents via AsyncByCrawl. The API is identical — just await calls and use async for with iterators.
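For illustration, the async equivalent of a simple flow might look like this. It is a sketch that assumes AsyncByCrawl mirrors the sync client exactly, as the note above states:

```python
import asyncio

from bycrawl import AsyncByCrawl

async def main() -> None:
    client = AsyncByCrawl()

    # Same call shapes as the sync client, just awaited
    user = await client.threads.get_user("zuck")
    print(user.data.username)

    # Iterators are consumed with `async for`
    async for post in client.threads.iter_user_posts("zuck"):
        print(post.text)

asyncio.run(main())
```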
Sync
```python
from bycrawl import ByCrawl

client = ByCrawl()

user = client.threads.get_user("zuck")
print(user.data.username)

for post in client.threads.iter_user_posts("zuck"):
    print(post.text)
```
Error Handling
```python
from bycrawl import ByCrawl, NotFoundError, RateLimitError, AuthenticationError

client = ByCrawl()

try:
    post = client.threads.get_post("invalid_id")
except NotFoundError:
    print("Post not found")
except RateLimitError as e:
    print(f"Rate limited — retry after {e.retry_after}s")
except AuthenticationError:
    print("Invalid API key")
```
Exception hierarchy:
| Exception | HTTP Status | Description |
|---|---|---|
| ByCrawlError | — | Base exception |
| APIError | any | HTTP API error (has status_code, body) |
| AuthenticationError | 401 | Invalid or missing API key |
| PermissionError | 403 | Insufficient scope |
| NotFoundError | 404 | Resource not found |
| RateLimitError | 429 | Rate/credit limit exceeded (has retry_after) |
| ServerError | 5xx | Server-side error |
| TimeoutError | — | Request timed out |
| ConnectionError | — | Network connection error |
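Because every HTTP-status exception inherits from APIError, one handler can cover all of them at once. The class definitions below are an illustrative sketch of that inheritance, not the SDK's actual source:

```python
# Illustrative re-creation of part of the hierarchy above
class ByCrawlError(Exception):
    """Base exception for all SDK errors."""

class APIError(ByCrawlError):
    """Any HTTP API error; carries the status code and response body."""
    def __init__(self, status_code: int, body: str = ""):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code
        self.body = body

class NotFoundError(APIError):
    """404: resource not found."""

class ServerError(APIError):
    """5xx: server-side error."""

def describe(exc: APIError) -> str:
    # Catching the APIError base class covers every HTTP-status subclass.
    try:
        raise exc
    except APIError as e:
        return f"API error {e.status_code}"
```

So `except APIError` works as a catch-all for HTTP failures, while `except ByCrawlError` additionally covers timeouts and connection errors.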
Configuration
```python
client = ByCrawl(
    api_key="sk_byc_...",                # or BYCRAWL_API_KEY env var
    base_url="https://api.bycrawl.com",  # default
    timeout=60.0,                        # seconds, default 60
    max_retries=2,                       # auto-retry on 429/5xx, default 2
)
```
The SDK automatically retries on rate limit (429) and server errors (5xx) with exponential backoff and jitter, and respects the Retry-After header.
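That retry behaviour can be sketched roughly as follows. This is a simplified illustration, not the SDK's actual implementation; `send` is a hypothetical zero-argument function returning a response with `status_code` and `headers`:

```python
import random
import time

def request_with_retries(send, max_retries: int = 2, base_delay: float = 1.0):
    """Call send() and retry on 429/5xx with exponential backoff and jitter."""
    for attempt in range(max_retries + 1):
        resp = send()
        if resp.status_code != 429 and resp.status_code < 500:
            return resp  # success or a non-retryable client error
        if attempt == max_retries:
            return resp  # retries exhausted; surface the last response
        retry_after = resp.headers.get("Retry-After")
        if retry_after is not None:
            delay = float(retry_after)  # server-provided wait takes precedence
        else:
            # full jitter: uniform over the exponentially growing window
            delay = random.uniform(0, base_delay * (2 ** attempt))
        time.sleep(delay)
```

With `max_retries=2` this makes at most three attempts, and the backoff window doubles after each failed one.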
TikTok search, comments, and user video endpoints use browser automation and take 20–40 seconds. Use a timeout of at least 60s. See Limitations for details.