I built a prototype to see if it is a good idea.
Despite feeling outdated, email remains one of the most enduring and essential communication channels. Almost 4.48 billion users worldwide sent over 361 billion messages daily in 2024 alone¹. Yet while our inboxes teem with activity, the experience itself hasn't kept pace. What if email could feel a bit more like 2025?
The original idea
My idea was very simple: I wanted to be able to email an AI assistant from any email client and get a meaningful response. One practical use case I envisioned was forwarding a long email conversation and receiving a concise, bullet-point summary.
I do not want to use email client plugins, because I use a wide variety of clients across different computers and phones. I also do not want to give any external service direct access to my inbox and then worry about privacy, terms of service, and so on.
How others are using it
However, as I shared the project with friends, they used it in completely different ways. Some treated it like ChatGPT over email, sending it questions out of curiosity. Others started role-playing with the AI, testing its creativity. A few even used it to brainstorm ideas or assist with writing. What began as a productivity tool quickly evolved into something more conversational and interactive, blurring the lines between email and AI-driven chat.
By now, I had gathered several feature requests that warranted deeper exploration and experimentation. But before diving into those, let’s first break down the technical foundation of how this AI-powered email assistant works so far.
The nerdy stuff: bringing AI to email
For a while now, I’ve been experimenting with running AI models locally on my desktop PC. The experience has been a mix of hits and misses, depending on the model’s efficiency and accuracy. Some models are too slow, while others consume too many resources.
For this project, I tested Llama 2, DeepSeek R1, and finally settled on Llama 3.1 (8B). It runs comfortably on my PC without causing significant slowdowns and delivers surprisingly strong results (sometimes on par with GPT-4o). This balance between performance and quality made it the ideal choice for powering my email assistant.
How It Works: AI-Powered Email Automation
To make this AI email assistant function, I wrote a simple Python script with only one third-party package:
- ollama (for getting AI-generated responses)
Everything else — imaplib for fetching unread emails, smtplib for sending replies, and email for parsing messages — comes straight from the Python standard library.
Step 1: Fetching Unread Emails
The script first checks the inbox for unread messages at regular intervals:
```python
import email
import imaplib

def fetch_unread_emails():
    """Return a list of (sender, subject, body) tuples for unread messages."""
    try:
        mail = imaplib.IMAP4_SSL(IMAP_SERVER)
        mail.login(EMAIL_ADDRESS, EMAIL_PASSWORD)
        mail.select("inbox")
        status, messages = mail.search(None, "UNSEEN")
        email_ids = messages[0].split()
        emails = []
        for e_id in email_ids:
            status, msg_data = mail.fetch(e_id, "(RFC822)")
            for response_part in msg_data:
                if isinstance(response_part, tuple):
                    msg = email.message_from_bytes(response_part[1])
                    sender = msg["From"]
                    subject = msg["Subject"]
                    body = ""
                    if msg.is_multipart():
                        # Grab the plain-text part of a multipart message
                        for part in msg.walk():
                            if part.get_content_type() == "text/plain":
                                body = part.get_payload(decode=True).decode(errors="replace")
                    else:
                        # get_payload(decode=True) returns bytes, so decode it too
                        body = msg.get_payload(decode=True).decode(errors="replace")
                    emails.append((sender, subject, body))
        mail.logout()
        return emails
    except Exception as e:
        print("Error fetching emails:", e)
        return []
```
Key Consideration: Email providers rate-limit frequent polling, so I call this function every 5 minutes to avoid triggering restrictions. Some providers are stricter than others, so it’s important to follow their specific guidelines. There are also alternatives to interval polling, such as IMAP IDLE push notifications, but I decided to keep it simple for now.
Step 2: Generating an AI Response
Once an email is retrieved, the script forwards it to the AI model for processing:
```python
import re
import ollama

def generate_ai_response(prompt):
    try:
        response = ollama.chat(model="llama3.1", messages=[{"role": "user", "content": prompt}])
        full_response = response["message"]["content"]
        # Strip <think>...</think> blocks emitted by some reasoning models
        filtered_response = re.sub(r'<think>.*?</think>', '', full_response, flags=re.DOTALL)
        return filtered_response.strip()
    except Exception as e:
        print("AI Error:", e)
        return "I'm sorry, but I couldn't process your request."
```
Why filter <think> tags? Some reasoning models (DeepSeek R1, for example) wrap their chain-of-thought in <think>…</think> annotations, and stripping them keeps the reply clean. Llama 3.1 doesn’t emit them, but keeping the filter makes it easy to swap models later.
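To see the filtering in isolation, here is a minimal sketch of the same regex as a standalone helper (the strip_think name is my own, not from the script above):

```python
import re

def strip_think(text: str) -> str:
    """Remove <think>...</think> blocks, including multi-line ones."""
    return re.sub(r'<think>.*?</think>', '', text, flags=re.DOTALL).strip()

raw = "<think>\nThe user wants a summary.\n</think>\nHere is your summary."
print(strip_think(raw))  # -> Here is your summary.
```

The re.DOTALL flag is what lets `.` match newlines, so a reasoning block spanning several lines is removed in one pass.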
Step 3: Sending a Reply
Once the AI generates a response, it’s automatically sent back to the original sender:
```python
import smtplib
from email.mime.text import MIMEText

def send_email(recipient, subject, body):
    try:
        server = smtplib.SMTP(SMTP_SERVER, SMTP_PORT)
        server.starttls()  # upgrade the connection to TLS before logging in
        server.login(EMAIL_ADDRESS, EMAIL_PASSWORD)
        msg = MIMEText(body)
        msg["From"] = EMAIL_ADDRESS
        msg["To"] = recipient
        msg["Subject"] = f"Re: {subject}"
        server.sendmail(EMAIL_ADDRESS, recipient, msg.as_string())
        server.quit()
        print(f"Replied to {recipient}")
    except Exception as e:
        print("Error sending email:", e)
```
Tying It All Together
These three functions run inside a continuous loop that:
- Checks for new emails
- Generates AI responses
- Sends replies automatically
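A minimal sketch of that loop, with the three functions passed in as parameters so the orchestration can be exercised without a live mailbox (the process_inbox name and the exact prompt formatting are my assumptions, not from the original script):

```python
import time

def process_inbox(fetch, generate, send):
    """Run one polling cycle: fetch unread mail and reply to each message."""
    replied = 0
    for sender, subject, body in fetch():
        reply = generate(f"Subject: {subject}\n\n{body}")
        send(sender, subject, reply)
        replied += 1
    return replied

def main_loop():
    # Wires in the three functions defined in the steps above
    while True:
        process_inbox(fetch_unread_emails, generate_ai_response, send_email)
        time.sleep(300)  # poll every 5 minutes to stay under provider rate limits
```

Passing the functions in rather than hard-coding them also makes it easy to stub out the network pieces during testing.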
What works and what doesn’t
This simple experiment worked surprisingly well for initial emails. It could summarize long conversations, answer questions, and brainstorm ideas. But as soon as a reply chain started, things got messy.
Since my prototype doesn’t store context, every new email feels like the start of a completely fresh conversation. The result? Confused responses and a lack of continuity. Here is an example:

To enable meaningful conversations over email, I need a way to preserve context across messages. A few possible solutions:
- Include previous emails in each prompt
  - A quick fix: attach chunks of past messages when calling ollama.
  - The downside? This increases the prompt size and may hit token limits.
- Use a vector database + embeddings
  - Store past emails as embeddings and retrieve relevant context dynamically.
  - This would allow the AI to “remember” things across email threads (e.g., if you sent your resume last week, it could refer back to it in future conversations).
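As a rough sketch of the first option, here is one way to pack recent messages into a prompt under a size budget (the build_prompt helper and the character cap are hypothetical choices of mine, not part of the prototype; a character count is only a crude proxy for token limits):

```python
def build_prompt(history, new_message, max_chars=4000):
    """Prepend as many recent past messages as fit under a rough size budget.

    history: list of past message bodies, oldest first.
    """
    context = []
    budget = max_chars - len(new_message)
    # Walk history from newest to oldest, keeping whatever still fits
    for msg in reversed(history):
        if len(msg) + 1 > budget:
            break
        context.append(msg)
        budget -= len(msg) + 1
    context.reverse()  # restore chronological order
    return "\n".join(context + [new_message])
```

The resulting string would then be passed to generate_ai_response in place of the bare email body, so the model sees the recent thread rather than a single message.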
I’ll be experimenting with these approaches next.
The Biggest Challenge: Avoiding the Spam Folder
Likely the hardest part, and the one I am most concerned about, is getting past email spam filters. I have already run into this once:

Right now, I don’t fully understand how email providers calculate spam scores, but I suspect a few things are working against me:
- Automated responses might trigger spam detection (perhaps the reply arrives too quickly for its length?).
- Lack of a reputation. Since this is a brand-new email sender, it’s not on any “trusted” lists yet.
I’ll need to dig deeper into how spam filters work and what I can do to improve deliverability. Maybe whitelisting my bot’s email, tweaking the email format, or setting up proper SPF/DKIM records could help.
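For reference, SPF and DKIM both live in DNS as TXT records. A hedged sketch of what they might look like for a hypothetical example.com sender (the IP address, selector name, and truncated key are all placeholders):

```
; SPF: authorize the server at 203.0.113.10 to send mail for example.com
example.com.                IN TXT "v=spf1 ip4:203.0.113.10 ~all"

; DKIM: publish the bot's public signing key under a selector
bot._domainkey.example.com. IN TXT "v=DKIM1; k=rsa; p=MIGfMA0...AB"
```

SPF tells receiving servers which hosts may send on the domain's behalf, while DKIM lets them verify a cryptographic signature on each message; both feed into the reputation scoring that decides whether a reply lands in the inbox or the spam folder.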
To be continued…
Sources
- The Radicati Group, Email Statistics Report, 2023–2027. https://www.radicati.com/wp/wp-content/uploads/2023/04/Email-Statistics-Report-2023-2027-Executive-Summary.pdf ↩︎