Introduction
I’ve been working on a small project to automate a newsletter using AI. The goal is to make something useful for people like me—solopreneurs or small teams who want to share good content without spending hours on it.
The core idea is to pull in articles from RSS feeds and let AI handle ranking and summarizing. This way, the entire workflow—from discovering new content to creating a first draft—can run mostly on autopilot.
This project is a way for me to see how far I can go with AI helping in the background—especially with tasks like ranking content and writing quick summaries. It’s not perfect, but it already saves me time.
Workflow
My goal was to fetch a list of RSS headlines and turn it into a curated, summarized weekly newsletter. It starts by downloading RSS feeds from multiple sources, then ranking and categorizing each headline to generate a weighted score. This score is used to select the top 10 articles for the week—balancing relevance, clarity, and interest. Once selected, each article is summarized using AI, and a short commentary is generated to tie everything together.
Tech Stack:
- Ruby
- OpenAI API
- AWS Lambda
- DynamoDB
Fetching RSS Feeds
I chose RSS feeds for content ingestion because they’re simple and efficient. With just a few HTTP requests, I can pull in hundreds—sometimes thousands—of headlines across multiple sources.
Since this is a weekly newsletter, I want to keep at least 7 days' worth of content, but I noticed that some sources only keep a few days of items in their feeds. To work around that, I built an AWS Lambda function that runs once per day and saves each feed's entries to DynamoDB.
DynamoDB might be overkill for this project, but it’s easy to use and it works. I set a Time-To-Live (TTL) of 14 days on each record. That means I can keep a rolling window of content without writing any cleanup logic—DynamoDB takes care of it automatically. In practice, I only need the most recent 7 days of content when generating the newsletter.
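As a sketch, here is roughly how a single feed entry could be turned into a DynamoDB record with that TTL attribute. The field names, key layout, and the `build_record` helper are illustrative, not my exact schema:

```ruby
require "time"

TTL_DAYS = 14
SECONDS_PER_DAY = 24 * 60 * 60

# Build a DynamoDB-ready record for one feed entry. DynamoDB TTL expects
# a Unix epoch (in seconds) in a designated attribute; once that time
# passes, the item is deleted automatically, so no cleanup job is needed.
def build_record(source:, url:, title:, published_at:, now: Time.now)
  {
    "source"       => source,            # partition key (illustrative)
    "url"          => url,               # sort key (illustrative)
    "title"        => title,
    "published_at" => published_at.to_i,
    "ttl"          => now.to_i + TTL_DAYS * SECONDS_PER_DAY
  }
  # In the Lambda, a hash like this would be passed to
  # Aws::DynamoDB::Client#put_item as the :item argument.
end
```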
One limitation of RSS is that you only get a headline and a short description of the article. That’s not enough to generate full summaries, and I don’t want to waste API calls summarizing everything. So instead, I focus on ranking headlines to select the most interesting content. Not having the full content means I can miss some context, but it’s a good compromise.

RSS Articles Table on DynamoDB
Ranking and Categorization
My goal here was to build a basic way to prioritize interesting posts using a mix of categories and a score system.
First, I created a list of categories relevant to tech, such as:
- AI Trends
- Startup Watch
- Web Development
- Dev Tools
- and more…
Each category gets a weight from 1 to 5 based on how important or trendy I think it is. Initially, I used OpenAI to score and categorize each headline. This worked great when I had fewer headlines. But once I started pulling from 10+ sources daily, I was getting over 700 headlines per week, and the OpenAI requests started timing out.
Instead of sending 700+ headlines to the LLM, I split the process in two: offline ranking using scoring heuristics, followed by LLM ranking.
Step 1: Scoring Heuristics (Offline Ranking)
Scoring heuristics are simple, rule-based techniques for assigning a score to items based on certain qualities or patterns. They’re a lightweight alternative to machine learning—easy to understand, fast to implement, and surprisingly effective when you know what you’re looking for.
Think of them as a checklist with points. You’re defining what “good” looks like in your context and giving content a score based on how many boxes it ticks.
The following function scores a headline across three areas: strategic keywords, clarity, and engagement. The keyword list can be adjusted over time to improve results.
def score_headline(headline)
  score = 0

  # Strategic keywords
  score += 3 if headline.match?(/(AI|OpenAI|Apple|Google|ChatGPT|Startups)/i)

  # Clear and specific
  score += 2 if headline.length < 80 && headline.match?(/\b(launch|acquires|raises|update|ban|report)\b/i)

  # Engaging format
  score += 1 if headline.include?("?") || headline.include?(":")

  score
end
I currently have over 100 keywords across five different categories. While this process requires manual upkeep, the scoring itself is deterministic and improves as the list evolves. I could technically select the top 10 headlines at this stage and get decent results, but layering in categorization and LLM scoring helps refine the selection even more.
Step 2: LLM Scoring and Categorization
With the top 100 headlines selected, I send them to the LLM in a single prompt. The prompt asks the model to:
- Assign a category from the predefined list (or suggest a new one, which I may review manually later).
- Score each headline from 1–5 based on:
- Relevance to current trends (1–2 pts)
- Engagement potential (1–2 pts)
- Clarity or originality (1 pt)
- Mark duplicate headlines, keeping the most compelling version.
🔎 Note: I don’t want the model to remove duplicates—just flag them for review.
Here’s how my current prompt looks. It takes a list of headlines in CSV format to minimize token usage. The output is JSON for easy parsing.
You are an expert online content strategist. Process the following list of news headlines. For each headline:
1. Assign one best-fit category from this list (or create a new one if needed):
{CATEGORIES}
2. Assign a score from 1 to 5 based on:
* Relevance to startup/tech readers (1–2 pts)
* Engagement potential (curiosity, emotion, or click-worthiness) (1–2 pts)
* Clarity or originality of the headline (1 pt)
3. For any equivalent or near-duplicate headlines, do the following:
* Keep the most interesting or best-scoring headline and include it in the results
* For all other equivalent or duplicate headlines, add `"ignored": true` to the output
* At least one version of a duplicate headline must have `"ignored": false`
## Headlines
{CSV_HEADLINES}
## Notes
* Output should include categories, scores, and the `ignored` property for all provided headlines
* The `ignored` property should be `true` for duplicates or redundant equivalents and omitted or false for others
* Output SHOULD NOT include ANY markdown or HTML tags
* Focus on tech-savvy and entrepreneur-minded readers
* Be consistent and concise in both your categorization and scoring
## Output
Return only ranked headlines with the required index, title, category, score, and `ignored` properties in JSON format.
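The plumbing around this prompt can be sketched as two small helpers: one builds the `{CSV_HEADLINES}` block, the other parses the model's JSON reply. Both are illustrative; `headlines_to_csv` and `parse_ranking` are names I'm using here, not part of any library:

```ruby
require "csv"
require "json"

# Build the CSV block substituted for {CSV_HEADLINES} in the prompt.
# Numeric indexes let the model reference headlines compactly.
def headlines_to_csv(headlines)
  CSV.generate do |csv|
    csv << %w[index title]
    headlines.each_with_index { |h, i| csv << [i, h] }
  end
end

# Parse the model's JSON reply into ranked entries, tolerating the
# `ignored` property being omitted for non-duplicates.
def parse_ranking(json_text)
  JSON.parse(json_text).map do |row|
    {
      index:    row.fetch("index"),
      category: row.fetch("category"),
      score:    row.fetch("score"),
      ignored:  row.fetch("ignored", false)
    }
  end
end
```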
Article Selection
Now I have a list of 100 headlines, each with a category, relevance score, and duplicate flag.
I run the list through a function to:
- Remove duplicates
- Ignore unknown categories
- Calculate a weighted score
(score * category weight)
- Sort by weighted score
- Allow a maximum of 3 headlines per category
- Pick the top 10 headlines
This gives me a high-quality selection without overusing the OpenAI API.
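Those selection steps can be sketched as a single chain. The weights below are placeholder values; my real list is longer and tuned over time:

```ruby
# Assumed category weights for illustration, not my real values.
CATEGORY_WEIGHTS = {
  "AI Trends"       => 5,
  "Startup Watch"   => 4,
  "Web Development" => 3,
  "Dev Tools"       => 2
}.freeze

MAX_PER_CATEGORY = 3
TOP_N = 10

# Each headline is a hash with :title, :category, :score, :ignored.
def select_articles(headlines)
  per_category = Hash.new(0)
  headlines
    .reject  { |h| h[:ignored] }                          # drop flagged duplicates
    .select  { |h| CATEGORY_WEIGHTS.key?(h[:category]) }  # ignore unknown categories
    .map     { |h| h.merge(weighted: h[:score] * CATEGORY_WEIGHTS[h[:category]]) }
    .sort_by { |h| -h[:weighted] }                        # best weighted score first
    .select  { |h| (per_category[h[:category]] += 1) <= MAX_PER_CATEGORY }
    .first(TOP_N)
end
```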
Generating Summaries
Step 1: Scraping Content
Fetching and cleaning article content could be a post on its own, so I won’t go into too much detail here.
The goal is simple: get the main body text, remove unnecessary HTML, and reduce token usage when passing the content to the LLM.
I used the ruby-readability gem to extract only relevant tags like `<p>`, `<h2>`, and `<ul>`, and strip everything else. This works well with most reputable sources, but could definitely be improved.
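As a rough illustration of what that extraction step does, here is a stdlib-only stand-in. The real pipeline relies on ruby-readability, which is far more robust than this regex sketch:

```ruby
# Simplified, illustrative stand-in for the ruby-readability step:
# keep a few content-bearing tags, strip everything else.
KEEP_TAGS = %w[p h2 ul].freeze

def extract_text(html)
  # Find the inner text of each kept tag (non-greedy, multiline).
  blocks = html.scan(%r{<(#{KEEP_TAGS.join("|")})[^>]*>(.*?)</\1>}m)
  blocks
    .map { |_tag, inner| inner.gsub(/<[^>]+>/, " ") }  # strip nested tags
    .map { |text| text.gsub(/\s+/, " ").strip }        # normalize whitespace
    .reject(&:empty?)
    .join("\n")
end
```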
Step 2: Summarizing
With the clean article text ready, I use a prompt to generate a short summary or commentary. I limit context to the actual article content to avoid hallucinations. The result is a focused summary—ready for the newsletter.
Here’s the prompt:
You're an expert newsletter editor. Based on the headline and article context provided, generate a short commentary article for a weekly newsletter.
## Input:
**Headline**: {TITLE}
**HTML Context**:
{HTML_CONTEXT}
NOTE: This may include raw HTML from a blog post or article.
## Instructions:
1. **Generate a new title** for the commentary article. This should reflect a thoughtful editorial perspective related to the original headline and content.
2. **Extract meaningful content** from the provided HTML. Strip all HTML tags and ignore irrelevant elements like navigation, ads, scripts, or unrelated boilerplate.
3. **Summarize and reflect**: Based only on the article’s main content, write **2-3 insightful sentences** that summarize, contextualize, or comment on the piece. The tone should match that of a professional newsletter editor—engaging, informed, and concise.
4. **Follow the style guidelines** listed in the section above
5. **Do not include** the original headline or any parts of the original html context.
6. **Do not include any HTML tags** in the final output.
## Output:
Output should be in JSON format matching the provided schema.
Generating Intro Commentary
Once all summaries are complete, I feed them back into the model one last time. This prompt generates:
- A short introduction for the newsletter
- A thematic or editorial take on the week’s content
This gives the newsletter a polished feel, with a cohesive intro tying everything together—even if it’s AI-generated.
Here’s the prompt:
You're an expert newsletter editor. Based on a set of article summaries from the past week, generate a compelling one-paragraph introduction for a weekly newsletter.
## Input:
**Newsletter Summaries**:
Each summary includes the article's title and a short description. These summaries may cover different topics, but often share a common theme, trend, or tone.
{CONTEXT}
## Instructions:
1. Carefully **read all summaries** and identify any overarching theme, trend, or notable contrasts across the content.
2. **Write 3–4 sentences** that introduce the newsletter to readers. This should:
- Highlight a unifying theme or interesting tension between the articles, if present.
- Be engaging and conversational, targeting a professional but curious tech-savvy audience.
- Create anticipation for what's inside without listing all articles or repeating summaries.
3. **Do not include** a generic welcome message (e.g., “Welcome to this week’s edition…”).
4. **Avoid listing article titles** or copying language directly from the summaries.
5. Keep the tone **warm, confident, and editorial**—like a human editor with personality.
6. **Follow the style guidelines** listed in the section above
## Output:
- Return the final result as a plain-text paragraph (no bullet points, no markdown, no HTML).
- Output should be in JSON format matching the provided schema.
Saving the Newsletter
Finally, I assemble everything and save it to a single Markdown file containing:
- Weekly intro / commentary
- 10 summarized articles grouped by category
Check out a sample Markdown file for reference.
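Assembling the file itself is straightforward string building. A minimal sketch, assuming each article carries `:category`, `:title`, and `:summary` fields (the newsletter title and heading levels here are illustrative):

```ruby
# Assemble the final newsletter as Markdown: intro first, then the
# selected articles grouped by category. Structure is a sketch, not
# my exact output format.
def build_newsletter(intro, articles)
  md = ["# This Week in Tech", "", intro, ""]
  articles.group_by { |a| a[:category] }.each do |category, items|
    md << "## #{category}"
    md << ""
    items.each do |a|
      md << "### #{a[:title]}"
      md << a[:summary]
      md << ""
    end
  end
  md.join("\n")
end
```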
I use Buttondown as the newsletter provider because it supports Markdown.
I still review each issue manually before sending. Once I’m happy with the final version, I copy/paste it into Buttondown. Eventually, I’ll automate this via API, but the manual step works for now.
Key Takeaways
Is it a good idea to run an AI-assisted newsletter? Maybe.
Going all-in on AI is not always the answer. Sometimes it helps to add simple, non-AI algorithms to keep things efficient. I use scoring heuristics to filter out low-quality headlines before involving an LLM, which makes the whole workflow faster and cheaper.
It’s also fair to question whether people want to read AI-generated content. That’s still evolving. For me, the most valuable part is having a system that surfaces fresh content and gives me a good starting point. It’s not replacing a human editor—it’s giving me a draft I can work with.
What’s Next?
I’ll be publishing the newsletter weekly, focused on tech and startups. I’ll continue adjusting the balance between manual and AI-generated sections based on what resonates most with readers.
I’m also exploring how this workflow could run on no-code tools like Make and Zapier. These platforms make it accessible for non-developers to build similar setups. I’ll be testing a few variants and documenting what works.
Although my current focus is tech, the same setup could work for almost any topic—as long as you can find good RSS sources.
Conclusion
What started as a simple experiment turned into something I plan to use every week.
The current setup isn’t perfect. I’d like to improve how I fetch and summarize content—especially getting better context from article sources. But even as-is, it saves time and gives me a solid structure for building content.
While I’m keeping the code private for now, I’m happy to share more details. If you’re curious about automating your own newsletter—or want help setting it up—feel free to reach out.