Unseen Opportunities: Upwork + GPT

How I analyzed 100,000+ freelance job posts with AI to uncover hidden startup opportunities

Summer 2022. I quit my lucrative AI job not out of dissatisfaction, but out of exhilaration. The previous couple of years had been brimming with happiness, which sparked an awakening: if I did not leap into the unknown now, doing so might eventually become difficult or even impossible. Within two hours of this epiphany, I gave my company two weeks' notice.

Before returning to AI, I had worked as a mover, a photographer, and a dating coach. They were entertaining diversions, but AI remained my true love, which led me to freelance and consult in the space. I explored a variety of avenues, but Upwork stood out: it consistently delivered quality clients and became my main source of income. It allowed me to travel around South America and have incredible adventures. One day, while jogging in Medellin, I realized without a shadow of a doubt that I was ready to build a startup. I started reading books, building prototypes, and talking to a lot of people in entrepreneurship and tech. One theme kept coming up, and Naval Ravikant put it succinctly: "You will get rich by giving society what it wants but does not yet know how to get. At scale."

I started looking for problems to solve. Then I remembered that I had been sitting on a vast pool of real-life business problems: Upwork. If a business does not have the capacity to solve a problem internally, it looks for help externally. Unlike survey respondents, clients post a problem only if they truly need a solution. Upwork is a goldmine of business opportunities; you just need a little help from AI to extract them.

The Space of Startup Ideas

Job posts are typically rich in textual data detailing candidate preferences, company profiles, and problem statements. Analytics on free text has historically been impractical, but with the advent of powerful GPT models, we can now extract substantial structured data from it.

Data Collection

I started gathering posts in real time and amassed over 100,000 job posts in just a few weeks. It took two days of prompt engineering to get something that could extract customer pain points accurately.

GPT Prompt Engineering

Here's the prompt I used to extract structured data from job posts (few-shot examples excluded):

meta_prompt = """
    Job Post :
    ```
    {job_description}
    ```
    You are a product-market-fit expert that follows the 80/20 rule, has an engineering mindset, and understands customer problems very deeply. Give non-BS, non-corporate response. 
    Return a list of tags that apply to the job posting. Choose the tags that apply to the job post from below:
    * Talent: If the post is describing the skills, experiences, or any other attributes they are looking for, or describes the responsibilities/requirements for the talent, they will go to this field.
    * Client: if the post talks about the poster (client). Could be a company, org, or individual, put that information in this field.
    * Problem: If the post describes a specific pain point or problem the customer has that talent can address, explain it here. This is an important field. Problem is something to be solved by the talent, so "looking for" / "seeking" must NOT be in this field. Again, the problem is not finding talent or hiring, but a problem to be solved by the talent. Do not include "looking to hire", "needs to hire" etc. in this field, just the client's problem to be solved by someone they would hire.
    * Implied Problem: If the post does not explicitly specify the problem, you can imply the problem,  this might be some more deep-rooted problem that was not mentioned in the post. Seeking talent is not a Problem. 
    * Client Solution: If the client already knows the solution, explain it here. Again, the solution is not finding talent or hiring, but a solution to be implemented by the talent.
    * Outcome: What's the business outcome if the problem and implied problems are solved?

    Respond in JSON format, for example:
    ```
    {{ "Talent": "...", "Client": "...", "Problem": "...", "Implied Problem": "..." }}
    ```
    Only add keys if the attribute applies. The values are strings with your explanation and reasoning for the chosen attribute, so the entire response is in JSON format. Only respond with the JSON.
...
    """

OpenAI's API updates came at the perfect time! I was able to use their 16k-context model to sample large chunks of client problems and start creating taxonomies.
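Before any taxonomy work, each reply to the extraction prompt has to be turned into structured data. Here is a minimal sketch of that step, assuming the reply arrives as a plain string. The shortened `TEMPLATE` stands in for the full prompt above, and `build_prompt`, `parse_reply`, and `ALLOWED_KEYS` are hypothetical names chosen for illustration. Note the doubled braces: `str.format` turns `{{` and `}}` into literal braces, so only `{job_description}` is substituted.

```python
import json

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the sketch readable

# Shortened stand-in for the full extraction prompt (an assumption for illustration).
TEMPLATE = """
Job Post :
{job_description}
Respond in JSON format, for example:
{{ "Talent": "...", "Problem": "..." }}
"""

# The tags the extraction prompt asks the model to choose from.
ALLOWED_KEYS = {"Talent", "Client", "Problem", "Implied Problem",
                "Client Solution", "Outcome"}


def build_prompt(job_description: str) -> str:
    # str.format fills {job_description}; "{{" and "}}" become literal braces.
    return TEMPLATE.format(job_description=job_description)


def parse_reply(reply: str) -> dict:
    """Parse a model reply into a dict, tolerating a surrounding code fence."""
    text = reply.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line and everything from the closing fence on.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return {}  # malformed reply: treat it as "nothing extracted"
    if not isinstance(data, dict):
        return {}
    # Keep only the expected keys that carry string values.
    return {k: v for k, v in data.items()
            if k in ALLOWED_KEYS and isinstance(v, str)}
```

Returning an empty dict for malformed replies is one possible design choice; with 100,000+ posts it keeps a batch job running, at the cost of silently skipping a few posts.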

Creating Problem Taxonomies

Here's the taxonomy creation code:
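# Assumed import (LangChain); newer releases expose it as langchain_core.messages.SystemMessage
from langchain.schema import SystemMessage

system_message = SystemMessage(content="You are a successful serial entrepreneur.")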
prompt_template = """
You are tasked with creating a high-level taxonomy.
These business problems are separated by |
```
{descs}
```

Create high-level categories for these problems. You must create as few categories as possible, and every problem should fall into at least one of the categories.
You are not classifying each job. You are summarizing the problems into high-level categories.
Respond with a JSON list of categories:
```json
["...", "...", "...", ...]
```
"""

After sampling the majority of the dataset, I distilled the resulting categories further and asked GPT-4 to provide the final categories. Now we can visualize the problem categories in relation to volume, investment, and customer budget.
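The sampling and distillation steps can be sketched roughly as follows. This is one possible approach, not necessarily the exact pipeline: `batch_problems` and `merge_categories` are hypothetical helpers, and the 40,000-character budget is a rough assumption (about 4 characters per token, leaving room in a 16k context for the instructions and the reply).

```python
import json
from collections import Counter


def batch_problems(problems, max_chars=40_000, sep=" | "):
    """Join problems into batches of at most max_chars characters.

    A single problem longer than max_chars still gets a batch of its own.
    """
    batches, current, size = [], [], 0
    for problem in problems:
        # "|" is the separator in the prompt, so replace it inside the text.
        clean = " ".join(problem.replace("|", "/").split())
        if not clean:
            continue
        added = len(clean) + (len(sep) if current else 0)
        if current and size + added > max_chars:
            batches.append(sep.join(current))
            current, size, added = [], 0, len(clean)
        current.append(clean)
        size += added
    if current:
        batches.append(sep.join(current))
    return batches


def merge_categories(replies):
    """Count in how many batch replies each category name appears."""
    counts = Counter()
    display = {}  # normalized name -> first spelling seen
    for reply in replies:
        try:
            categories = json.loads(reply)
        except json.JSONDecodeError:
            continue  # skip replies that are not valid JSON
        if not isinstance(categories, list):
            continue
        seen = set()
        for name in categories:
            if not isinstance(name, str):
                continue
            key = " ".join(name.lower().split())
            if not key or key in seen:
                continue
            seen.add(key)
            display.setdefault(key, name.strip())
            counts[key] += 1
    return [(display[key], n) for key, n in counts.most_common()]
```

Each batch would fill `{descs}` in a template like the one above, and the merged, frequency-ranked list is the kind of input that could then be handed to a stronger model for the final round of distillation.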

Data Visualizations

Problem Categories by Budget & Volume

Bubble size represents the number of job postings. The x-axis shows average hourly pay, and the y-axis shows average client spending on Upwork. Together they reveal how much people are willing to pay to solve problems in each category.
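The three quantities behind each bubble can be computed with a simple group-by. The sketch below assumes each post has already been assigned a category and carries an hourly pay and a client spend figure; the field names `category`, `hourly_pay`, and `client_spend` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def bubble_stats(posts):
    """Aggregate posts into one record per category for a bubble chart."""
    groups = defaultdict(list)
    for post in posts:
        groups[post["category"]].append(post)
    stats = {}
    for category, rows in groups.items():
        stats[category] = {
            # Bubble size: number of job postings in the category.
            "count": len(rows),
            # X-axis: average hourly pay.
            "avg_hourly_pay": mean(r["hourly_pay"] for r in rows),
            # Y-axis: average client spending.
            "avg_client_spend": mean(r["client_spend"] for r in rows),
        }
    return stats
```

The resulting records map directly onto a scatter plot in most plotting libraries, with `count` scaled to the marker size.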

Semantic Problem Space

I created vector embeddings of all problems and reduced their dimensions using UMAP. Explore the semantic space below - hover over points to read the problem descriptions.
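A minimal sketch of the reduction step, assuming the embeddings have already been computed as a list of equal-length float vectors (the embedding model is not specified here). It uses the `umap-learn` package; the cosine metric and fixed `random_state` are assumptions, and `reduce_to_2d` and `hover_records` are hypothetical helper names.

```python
def reduce_to_2d(vectors, random_state=42):
    """Project high-dimensional embeddings to 2-D with UMAP.

    The imports are local so the rest of the sketch runs without them.
    Requires: pip install numpy umap-learn
    """
    import numpy as np
    import umap  # provided by the umap-learn package

    reducer = umap.UMAP(n_components=2, metric="cosine",
                        random_state=random_state)
    data = np.asarray(vectors, dtype=np.float32)
    return reducer.fit_transform(data).tolist()


def hover_records(points, problems):
    """Pair each 2-D point with its problem text for use as a hover tooltip."""
    if len(points) != len(problems):
        raise ValueError("points and problems must have the same length")
    return [{"x": float(x), "y": float(y), "text": text}
            for (x, y), text in zip(points, problems)]
```

Fixing `random_state` makes the layout reproducible between runs, which helps when comparing charts, though it disables some of UMAP's parallelism.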

Key Insight

This market analysis helped me pinpoint the ideal sector for building my initial startup: financial technology.