Building a Free, Multi-User Telegram Bot: When Infrastructure Constraints Drive Architecture

  • MyrinNew
    Senior Member
    • Feb 2024
    • 3799

    The Problem Space

    At 2 AM, with 43% battery and no power, I needed to build a system that could:
    • Send randomized messages to multiple users throughout the day
    • Scale to handle arbitrary user counts
    • Cost exactly $0 to run
    • Deploy and forget about it


    The obvious solution—Twilio's WhatsApp API—sat behind a paywall. But constraints breed creativity, and what followed was an exercise in building production-grade infrastructure with free-tier services.


    Architecture Overview

    The final system consists of three core components:


    1. Multi-User Bot with Individual Scheduling





    # Each user gets their own schedule, persisted in JSON
    users = {
        "chat_id": {
            "active": True,
            "messages_per_day": 3,
            "start_hour": 8,
            "end_hour": 22
        }
    }







    2. APScheduler for Randomized Delivery





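    import random

    def schedule_user_affirmations(chat_id, messages_per_day, start_hour, end_hour):
        # One daily cron job per message slot, each at a random time in the window
        for i in range(messages_per_day):
            random_hour = random.randint(start_hour, end_hour - 1)  # randint is inclusive
            random_minute = random.randint(0, 59)

            scheduler.add_job(
                send_affirmation_to_user,
                'cron',
                args=[chat_id],
                hour=random_hour,
                minute=random_minute,
                id=f'user_{chat_id}_msg_{i}',
                replace_existing=True  # rescheduling swaps the slot's job instead of raising ConflictingIdError
            )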







    3. Webhook-Based Deployment

    Polling vs. webhooks became critical for deployment. Telegram delivers a bot's updates to only one consumer at a time: either a single getUpdates (polling) client or a single webhook URL. That constraint shapes how the bot can be deployed.


    The Polling Problem

    Initial implementation used infinity_polling():






    # Works locally, breaks in production
    bot.infinity_polling()







    Error:






    ApiTelegramException: Error code: 409.
    Description: Conflict: terminated by other getUpdates request







    This happens because:

    1. Local instance starts polling
    2. Deployed instance starts polling
    3. Telegram accepts only one getUpdates consumer per bot token, so each new request terminates the other instance's open long poll with a 409
    4. Both instances keep retrying, creating a conflict loop


    Solution: Webhook Architecture





    if WEBHOOK_URL:
        # Production: Telegram pushes updates to us
        webhook_url = f"{WEBHOOK_URL}/{BOT_TOKEN}"
        bot.remove_webhook()
        bot.set_webhook(url=webhook_url)
    else:
        # Development: We poll Telegram for updates
        bot.infinity_polling()







    Flask endpoint to receive webhooks:






    @app.route(f'/{BOT_TOKEN}', methods=['POST'])
    def webhook():
        # Accept only JSON bodies; anything else is rejected
        if request.headers.get('content-type') == 'application/json':
            json_string = request.get_data().decode('utf-8')
            update = telebot.types.Update.de_json(json_string)
            bot.process_new_updates([update])
            return '', 200
        return '', 403







    Why This Matters

    Polling (Development):
    • Bot continuously asks Telegram: "Any new messages?"
    • Simple, works for local testing
    • Cannot coexist with other instances


    Webhooks (Production):
    • Telegram sends messages directly to your server
    • More efficient (no constant polling)
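    • Removes the getUpdates conflict, but a bot token has only one webhook URL and cannot be polled while a webhook is set, so separate environments need separate bot tokens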
    • Production-grade approach


    State Management

    User preferences persist across restarts using JSON:






    def load_users():
        try:
            with open(USERS_FILE, 'r') as f:
                return json.load(f)
        except FileNotFoundError:
            # First run: no users yet
            return {}

    def save_users(users):
        with open(USERS_FILE, 'w') as f:
            json.dump(users, f, indent=2)







    Trade-offs considered:
    • Redis/PostgreSQL: Requires additional services, kills free-tier budget
    • SQLite: Better for production, but adds complexity
    • JSON file: Simple, sufficient for <1000 users, zero infrastructure cost


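    For a constraint-driven project, JSON files are appropriate, and the system can always migrate to a database when scale demands it. One caveat: the file survives restarts only if the host's disk does. Render's default filesystem is ephemeral, so the users file is lost on redeploy unless it lives on persistent storage.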


    Deployment: Free Tier Engineering

    Platform: Render.com

    Why Render:
    • True free tier (no credit card required)
    • Auto-deploys from GitHub
    • Includes SSL/HTTPS (required for Telegram webhooks)
    • Provides a persistent URL


    Configuration (render.yaml):






    services:
      - type: web
        name: affirmations-bot
        runtime: python
        buildCommand: pip install -r requirements_telegram.txt
        startCommand: python telegram_app_webhook.py
        envVars:
          - key: TELEGRAM_BOT_TOKEN
            sync: false
          - key: WEBHOOK_URL
            sync: false







    The Free Tier Caveat

    Render's free tier spins down after 15 minutes of inactivity. For a bot that needs to send scheduled messages, this is a problem.


    Solution: UptimeRobot
    • Free monitoring service
    • Pings your app every 5 minutes
    • Keeps the service awake
    • Zero cost




    GET https://affirmations-bot.onrender.com/health
    Every 5 minutes
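
The ping above needs something to answer it. The post does not show the /health handler, so the following is only a minimal sketch of what it could look like in Flask; the `app` object and the plain-text "ok" body are assumptions, not the author's code.

```python
from flask import Flask

# Assumption: in the real project this would be the existing Flask app
# that also serves the webhook route.
app = Flask(__name__)


@app.route("/health", methods=["GET"])
def health():
    # Keep this cheap: no file or network I/O, just confirm the process is up.
    return "ok", 200
```

Any 2xx response is enough for an uptime monitor, so the handler deliberately does no work beyond replying.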







    Scheduling Architecture

    Daily reschedule pattern prevents predictability:






    def reschedule_all_users():
        """Runs at midnight, generates new random times"""
        users = load_users()
        for chat_id, user_data in users.items():
            if user_data.get('active', True):
                schedule_user_affirmations(
                    int(chat_id),  # JSON keys are strings
                    user_data.get('messages_per_day', 3),
                    user_data.get('start_hour', 8),
                    user_data.get('end_hour', 22)
                )

    # Add to scheduler
    scheduler.add_job(
        reschedule_all_users,
        'cron',
        hour=0,
        minute=1,
        id='daily_reschedule'
    )







    Result:
    • User receives 3 messages daily
    • Times randomized each day (e.g., 9:23, 14:47, 19:12)
    • No predictable patterns
    • Feels organic, not automated


    User Experience Design

    Bot commands follow Telegram conventions:






    @bot.message_handler(commands=['start'])
    def send_welcome(message):
        # Auto-subscribe new users
        # Generate initial schedule
        # Send welcome message
        pass  # stub: body omitted in this outline

    @bot.message_handler(commands=['settings'])
    def show_settings(message):
        # Display current config
        # Provide customization options
        pass  # stub: body omitted in this outline

    @bot.message_handler(commands=['pause', 'resume'])
    def toggle_subscription(message):
        # User controls their subscription
        # Preserves preferences for resume
        pass  # stub: body omitted in this outline







    Key insight: Don't over-engineer. Users want:

    1. /start → immediate value
    2. /settings → control
    3. /pause → temporary opt-out (not deletion)
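
The "pause is not deletion" idea can be sketched as plain functions over the users dict. This is an illustration under assumptions: `DEFAULTS`, `subscribe`, and `set_active` are hypothetical names, and the real handlers would also need to save the file and update the scheduler.

```python
# Hypothetical defaults, mirroring the values shown in the schema above.
DEFAULTS = {"active": True, "messages_per_day": 3, "start_hour": 8, "end_hour": 22}


def subscribe(users, chat_id):
    """Add a user with default settings; existing users keep their own."""
    key = str(chat_id)  # JSON object keys are always strings
    if key not in users:
        users[key] = dict(DEFAULTS)
    users[key]["active"] = True
    return users[key]


def set_active(users, chat_id, active):
    """Flip only the 'active' flag so /resume restores the old schedule."""
    key = str(chat_id)
    if key not in users:
        return None  # unknown user: the caller can suggest /start
    users[key]["active"] = active
    return users[key]
```

Because /pause touches a single flag, a later /resume finds messages_per_day and the hour window exactly as the user left them.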


    Technical Challenges & Solutions

    Challenge 1: Timezone Handling

    Users in different timezones need messages at their local hours.


    Current solution: Server time + user-specified hours






    start_hour = 8 # 8 AM server time







    Future enhancement:






    user_timezone = pytz.timezone(user_data.get('timezone', 'UTC'))
    local_time = datetime.now(user_timezone)
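
To turn that into scheduling, the user's local hour has to be converted to the scheduler's clock. Below is a rough sketch that assumes the scheduler runs in UTC; `local_hour_to_utc` is a hypothetical helper, and a fixed UTC+1 offset stands in for a named zone (a `zoneinfo.ZoneInfo` or pytz zone object could be passed as `tz` instead).

```python
from datetime import date, datetime, time, timedelta, timezone


def local_hour_to_utc(hour, tz, on_date):
    """Return (hour, minute) in UTC for `hour`:00 local time on `on_date`."""
    local_dt = datetime.combine(on_date, time(hour=hour), tzinfo=tz)
    utc_dt = local_dt.astimezone(timezone.utc)
    # Minutes matter for zones with half-hour offsets.
    return utc_dt.hour, utc_dt.minute


# Fixed UTC+1 offset used only for illustration.
wat = timezone(timedelta(hours=1))
print(local_hour_to_utc(8, wat, date(2024, 2, 1)))  # (7, 0)
```

Passing the date keeps the conversion correct for zones that observe daylight saving time, since the offset can differ from day to day.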







    Challenge 2: Message Deduplication

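    With random scheduling, two slots can draw the same time, and rescheduling can register the same job twice.


    Partial solution: APScheduler job IDs are unique per user and message slot, so a slot never holds two jobs. The IDs do not stop two different slots from landing on the same minute; that requires drawing distinct times.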






    id=f'user_{chat_id}_msg_{i}' # Unique per user, per message slot
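
One way to rule out same-minute collisions is to sample distinct minutes from the whole window instead of drawing hour and minute independently. This is a sketch, not the post's code: `pick_send_times` is a hypothetical helper whose result would feed the `hour` and `minute` arguments of the cron jobs.

```python
import random


def pick_send_times(messages_per_day, start_hour, end_hour, rng=random):
    """Pick distinct (hour, minute) pairs inside [start_hour, end_hour)."""
    window = range(start_hour * 60, end_hour * 60)  # minutes since midnight
    count = min(messages_per_day, len(window))      # cannot exceed the window size
    minutes = sorted(rng.sample(window, count))     # sample() never repeats a value
    return [divmod(m, 60) for m in minutes]
```

Sampling without replacement guarantees distinct minutes, and sorting returns the slots in chronological order.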







    Challenge 3: State Corruption

    What if the server crashes mid-write?


    Mitigation:






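    import json
    import os

    def save_users(users):
        # Atomic write pattern: write a temp file, then rename it over the original
        temp_file = USERS_FILE + '.tmp'
        with open(temp_file, 'w') as f:
            json.dump(users, f, indent=2)
        os.replace(temp_file, USERS_FILE)  # Atomic on POSIX when both paths are on the same filesystem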
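
A slightly more defensive variant, offered as a sketch: flush and fsync before the rename so the data reaches disk first, and let the loader fall back to an empty dict if the file is missing or corrupt. The names `save_users_atomic` and `load_users_safe` are hypothetical, and silently starting empty on corrupt data is a design assumption that may not suit every bot.

```python
import json
import os
import tempfile


def save_users_atomic(users, path):
    """Write to a temp file in the same directory, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(users, f, indent=2)
            f.flush()
            os.fsync(f.fileno())  # push bytes to disk before the rename
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_users_safe(path):
    """Return {} when the file is missing or does not contain valid JSON."""
    try:
        with open(path, "r") as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
```

Creating the temp file in the target's own directory keeps both paths on one filesystem, which is what makes the final rename atomic.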







    Cost Breakdown

    Component           Service            Cost
    Messaging API       Telegram Bot API   $0
    Hosting             Render.com         $0
    Uptime Monitoring   UptimeRobot        $0
    Version Control     GitHub             $0
    Total                                  $0/month


    Twilio equivalent, assuming 3 messages per user per day and a 30-day month: ~$0.005/message = $0.015/day/user = $0.45/month/user


    At 100 users: $45/month vs. $0.
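
The arithmetic above can be checked with a few lines. The per-message price is the post's own rough estimate, not a quoted Twilio rate, and `monthly_cost` is a hypothetical helper.

```python
def monthly_cost(users, messages_per_day=3, price_per_message=0.005, days=30):
    """Estimated monthly spend for a pay-per-message API."""
    return users * messages_per_day * price_per_message * days


print(round(monthly_cost(1), 2))    # 0.45
print(round(monthly_cost(100), 2))  # 45.0
```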


    Performance Characteristics

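    Single instance handles:
    • ~100 subscribed users comfortably
    • ~300 outbound messages/day (3 per user), about 0.0035 messages/second on average
    • Inbound requests on top of that: user commands plus one health ping every 5 minutes
    • Peaks during scheduling windows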


    Bottlenecks:

    1. Telegram API rate limits (roughly 30 messages/second across chats, and about 1 message/second within a single chat)
    2. Render free tier CPU/memory
    3. JSON file I/O (becomes issue >1000 users)
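
If a broadcast ever approaches the rate limit, a fixed gap between sends keeps the bot under it. This is a minimal sketch under assumptions: `send_paced` is a hypothetical helper, `send` stands for whatever function delivers one message, and 25 per second is an arbitrary margin below the limit.

```python
import time


def send_paced(chat_ids, send, max_per_second=25, sleep=time.sleep):
    """Call send(chat_id) for each id, pausing so the rate stays under the cap."""
    interval = 1.0 / max_per_second
    sent = 0
    for chat_id in chat_ids:
        if sent:
            sleep(interval)  # fixed gap between consecutive sends
        send(chat_id)
        sent += 1
    return sent
```

The `sleep` parameter exists so the pacing can be tested without waiting; in production the default `time.sleep` applies.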


    Scaling path:
    • Migrate to PostgreSQL (~1000 users)
    • Horizontal scaling with Redis queue (~10k users)
    • Switch to paid Render tier (~100k users)


    Lessons from Constraint-Driven Development

    1. Start with the Free Tier

    Don't prematurely optimize for scale you don't have. JSON files work until they don't.


    2. Understand Your Platform's Execution Model

    Polling vs. webhooks isn't just a technical detail—it's the difference between working and not working in production.


    3. Constraints Force Better Architecture

    No database? You design for minimal state. No always-on hosting? You make your app stateless and resilient.


    4. Documentation as Infrastructure

    Half the battle is making it reproducible:






    git clone repo
    pip install -r requirements_telegram.txt
    # Add bot token to .env
    python telegram_app_webhook.py







    If it takes 5+ steps, you're doing it wrong.


    The Meta-Problem: Environment Parity

    Building from Lagos means:
    • Intermittent power → Local development gets interrupted
    • Slow/expensive internet → Downloading dependencies is costly
    • Limited payment options → Many services unavailable
    • Time zone challenges → Debugging with US-based support


    These aren't excuses—they're parameters. Good engineering adapts.


    Development environment:






    Power: 43% battery, no outlet
    Internet: 3G tethered from phone
    Time: 2:47 AM
    Deadline: Yesterday







    Production environment:






    Power: ✓ Always on
    Internet: ✓ High bandwidth
    Time: ✓ 24/7 availability
    Cost: $0 (hard constraint)







    The gap between these environments shapes the architecture. You build:
    • Offline-first documentation (can't count on Stack Overflow loading)
    • Minimal dependencies (pip install takes forever)
    • Aggressive caching (can't re-download on every restart)
    • Robust error handling (can't debug when offline)


    What's Next

    Immediate improvements:

    1. Add timezone support per user
    2. Implement message templates (user-customizable)
    3. Add analytics dashboard (messages sent, active users)


    Future architecture:

    1. Migrate to PostgreSQL as users approach 1,000
    2. Add message queue (Celery + Redis) for reliability
    3. Implement A/B testing for message timing
    4. Add web interface for non-Telegram management


    System evolution pattern:






    JSON file → SQLite → PostgreSQL → Distributed DB
    Single instance → Load balanced → Microservices
    Monolith → Modular monolith → Services







    Migrate when the pain exceeds the migration cost. Not before.
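
For a sense of how small the first step of that path is, here is a sketch of copying the users dict into SQLite with only the standard library. The table layout and the name `migrate_users_to_sqlite` are assumptions for illustration; the defaults mirror the ones used in the rescheduling code.

```python
import sqlite3


def migrate_users_to_sqlite(users, conn):
    """Copy the users dict into a `users` table; safe to run more than once."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS users (
               chat_id TEXT PRIMARY KEY,
               active INTEGER NOT NULL,
               messages_per_day INTEGER NOT NULL,
               start_hour INTEGER NOT NULL,
               end_hour INTEGER NOT NULL
           )"""
    )
    rows = [
        (
            str(chat_id),
            int(data.get("active", True)),  # SQLite has no boolean type
            data.get("messages_per_day", 3),
            data.get("start_hour", 8),
            data.get("end_hour", 22),
        )
        for chat_id, data in users.items()
    ]
    # INSERT OR REPLACE makes reruns overwrite rows instead of failing.
    conn.executemany("INSERT OR REPLACE INTO users VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

Calling it with `sqlite3.connect("users.db")` and the output of `load_users()` would be the whole migration; the file name here is also just an example.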


    Code Repository

    Full implementation: [GitHub link]


    Stack:
    • Python 3.11
    • Flask (web server)
    • pyTelegramBotAPI (Telegram SDK)
    • APScheduler (job scheduling)
    • Render (hosting)


    To deploy your own:






    git clone [repo]
    pip install -r requirements_telegram.txt
    # Add TELEGRAM_BOT_TOKEN to .env
    python telegram_app_webhook.py







    Closing Thoughts

    The "right" solution isn't always the obvious one. When Twilio was gated, I could have:

    1. Paid for it (out of budget)
    2. Given up (not an option)
    3. Found another way (what actually happened)


    Engineering isn't just about writing code—it's about navigating constraints, making trade-offs, and shipping despite the environment.


    Resource-lean contexts don't produce worse engineers. They produce engineers who:
    • Understand trade-offs deeply
    • Build resilient systems by default
    • Know when "good enough" is actually good enough
    • Can build production infrastructure for $0


    The feature shipped. The users are happy. The cost is zero.


    That's the only metric that matters.





    Technical Stack:
    • Telegram Bot API (webhooks)
    • Flask (HTTP server)
    • APScheduler (cron-like scheduling)
    • Render.com (PaaS hosting)
    • GitHub (CI/CD via git push)


    Performance:
    • 100 users, 3 messages/day = 300 messages/day
    • ~0.0035 messages/sec average (300 messages / 86,400 s)
    • <100ms p99 latency (webhook processing)
    • Zero cost at any scale under 10k users


    Want to build something similar?

    The repository is open source. Fork it, modify it, deploy it. All the infrastructure patterns are reusable.


    Sometimes the best technology is the free technology you can ship today. Please try it here: https://web.telegram.org/k/#@my_affirmation_fr_bot





    Written at 4:23 AM, 19% battery remaining, on generator power. Deployed successfully to production before the power cut out again.



