alt-tab.io

Adventures in context switching

Generated by Midjourney using prompt: 'Robot uses handheld scanner to scan qr code at museum exhibition'

In my journey to develop a retail point-of-sale advertising display for my wife's business (stay tuned for a future blog post!), I've found great support in my workflow from a combination of powerful tools. Primarily, I've been using OpenAI's GPT-4 along with GitHub's dynamic duo of AI-powered VSCode extensions: Copilot and Copilot Chat.

These Generative AI tools have become indispensable to me, significantly enhancing my productivity and efficiency throughout the development process. As a husband and dad with limited time for tinkering, their support has been invaluable. I find myself in a state of flow more often, allowing me to make the most of the precious time I have available.

Whether I'm asking questions to help gauge the complexity of an idea, double-checking my memory of a concept, asking for the code for a quick functional POC, or working with my code in the editor, these AI tools have been great additions to my toolbox. Most of the time the AI responses are nearly as accurate as I need, and occasionally they're spot-on.

Case in point: I was struck by the urge to try developing a simple app that would:

  1. Access the local webcam every few seconds.
  2. Scan the image for a valid QR code.
  3. Print the code to the console.

Not having any idea where I'd start, beyond an interest in using JavaScript for the challenge, I asked ChatGPT to get me started:

Random inspiration meets GPT-4, results in a tiny Node.js app that uses a webcam to continually read qr codes from the field of view, and prints them to the console. Complete with concise NPM project setup and run instructions, with a troubleshooting section covering expected common issues with discovery and resolution notes.

GPT-4 responded with:

Sure, I can help outline how such a project can be structured. We'll use the qrcode-reader package to read QR codes, node-webcam to access the webcam, and Jimp to process images.

Here is a simple guide to setting up and running this project:

1. Setup:

First, ensure that Node.js and NPM are installed on your machine. If not, visit the Node.js website to download and install the latest version.

Next, create a new directory for your project, navigate into it, and initialize a new NPM project by running:

mkdir qr-reader && cd qr-reader
npm init -y

Then, install the necessary dependencies:

npm install node-webcam qrcode-reader jimp

2. Writing the Application:

Create a new index.js file and open it in your favorite code editor. Here's a simple application that can capture images from your webcam and decode any QR codes found in them:

const NodeWebcam = require("node-webcam");
const QrCode = require("qrcode-reader");
const Jimp = require("jimp");

// Webcam options
const opts = {
  width: 1280,
  height: 720,
  quality: 100,
  delay: 0,
  saveShots: true,
  output: "jpeg",
  device: false,
  callbackReturn: "location",
  verbose: false
};

// Create the webcam instance
const Webcam = NodeWebcam.create(opts);

function captureAndDecode() {
  Webcam.capture('test_picture', function(err, data) {
    if(err) {
      console.log(err);
      return;
    }

    Jimp.read('test_picture.jpg', function(err, image) {
      if(err) {
        console.log(err);
        return;
      }

      let qr = new QrCode();
      qr.callback = function(err, value) {
        if(err) {
          console.log(err);
          return;
        }

        console.log(value.result);
        console.log('------------------------');
      };

      qr.decode(image.bitmap);
    });
  });
}

// Continually capture and decode
setInterval(captureAndDecode, 5000);

This application will take a picture every 5 seconds and attempt to decode any QR codes found in it.

3. Run the Application:

To start the application, use the node command:

node index.js

If everything is set up correctly, you should start seeing QR code data printed to your console.

4. Troubleshooting:

  • If the QR codes are not being read properly, try adjusting the webcam resolution in the opts object.
  • If there's an issue with the webcam not being found, ensure it's properly connected and try specifying the device in the opts object.
  • If you're receiving file system errors, ensure your application has permission to read/write in its current directory.
  • If you're getting a MODULE_NOT_FOUND error, ensure all required dependencies are properly installed. If necessary, delete the node_modules directory and the package-lock.json file and reinstall your dependencies with npm install.

Remember to always check the error logs for clues on what could be going wrong. And when in doubt, the Node.js and module documentation, along with search engines and relevant developer communities, can be invaluable resources for troubleshooting.

To my delight, the app ran without a hiccup, and did exactly what I asked.

[Screenshot: the app's console output]

With the implementation effectively started and finished before I'd even had a chance to search "webcam" on npmjs.com, I spent my time amused instead of challenged.

I'm not sure if I'll use this code for anything, but it was fun anyway.

Source code for this project is available on GitHub


As a software engineer and architect, I have navigated countless debugging mazes to resolve elusive issues in codebases spanning a multitude of programming languages, frameworks, and technologies. A key lesson learned during this journey is that debugging skills, which arguably form the backbone of effective software development, seldom receive the attention they deserve. What discussion I do hear and read tends to center on the tools and what to do with them, rather than on the practice itself. Here, I share a handful of my own debugging practices in hopes that a tidbit might be beneficial to you when you're neck-deep in code, frustrated and mentally drained.

  1. Embrace the Scientific Method: Observe, Hypothesize, Experiment, Record
    Debugging is more science than art. Adopting the scientific method of making observations, developing hypotheses, conducting experiments, and recording outcomes can keep your debugging structured and productive. Any change you make, any outcome observed, should be recorded - whether in a digital note-taking app, on a piece of paper, or in the form of comments in your code.

  2. Treat Assumptions as Hypotheses
    All developers form assumptions while debugging. Assumptions aren't bad, but remember to treat them as hypotheses that require validation. Assume nothing, question everything. This leads to the next point.

  3. Validate Your Assumptions
    Begin with what you think you know and put your assumptions to the test. If there's a shred of doubt, don't ignore it. Let your curiosity drive you to validate until every doubt is either ruled out or turned into a confirmed fact.

  4. Adjust Based on New Facts
    When you encounter new facts or findings, pause and recalibrate. Revisit your previously validated facts and assumptions to see if anything needs rechecking or additional validation based on your new knowledge.

  5. Make One Change at a Time
    Stick to one change per experimental cycle. It could be a local code change, a config adjustment, or anything else - just ensure you make only one modification per attempt. This avoids the chaos of not knowing which change led to a certain outcome.

  6. Practice Self-Care
    Debugging can be a demanding task, both physically and mentally. Be kind to yourself. Pay attention to your posture, take breaks, get some fresh air, and maybe even catch a few rays of sunlight. The brief distraction might just spark the solution you need.

  7. Talk to Yourself, or a Rubber Duck
    Walking through your problem aloud can provide fresh insights. If talking to yourself seems awkward, try Rubber Duck Debugging. Explaining the issue to an inanimate object (or a willing colleague) can help untangle your thoughts and provide a new perspective.

  8. Don't Hesitate to Ask for Help
    Debugging doesn't have to be a solo journey. Asking for help can bring fresh eyes to the problem, often resulting in quicker solutions. Even the most experienced developers can benefit from a fresh perspective.

  9. Simplify the Reproduction Steps
    If the issue requires a convoluted process to reproduce, pause and consider whether it can be simplified. Don't hesitate to tweak the upstream code locally to make reproduction easier and quicker.

  10. Use Separate Code Repository Branches for Each Foray
    Having a new local branch for each attempt can be a lifesaver when you need to revisit a previously discarded trial. This prevents the dreaded scenario of having to recreate changes you've painstakingly undone.

  11. Celebrate the Small Wins
    Each little discovery that helps shed light on the issue is a win. Celebrate it by walking away. Allow your mind to absorb the information as you shift your focus for a bit before diving back in.

Wrap-Up

Debugging isn't about being magically quick at finding impossible problems. It's about staying organized, not being afraid to question everything, and remembering to look after yourself and tap into the collective brainpower around you. Use these tips to up your debugging game, solve those tricky software problems faster, and really shine a light on the art of debugging.

Happy Debugging!


"Copilot" isn't just for code anymore

In my last blog post, I delved into the details of the revolutionary ChatGPT plugin framework. This powerful feature set expanded the capabilities of the ChatGPT model beyond simple conversation and maintaining context, enabling connection with real-world services. Today, during the opening keynote of Microsoft Build 2023, Microsoft unveiled a leap forward in the usage of this framework, broadening its access and applicability. The common plugin platform, initially designed for ChatGPT, will soon extend across several other OpenAI-backed tools, such as GitHub Copilot, Bing Search, the Edge Bing sidebar, Microsoft Teams, Microsoft 365, and more.

Enhanced Platform Extensibility


"Copilot" will replace the word "assistant" if Microsoft's plans succeed. This unified platform is sold as a game-changer in the realm of AI services. It essentially serves as a common X-as-a-Service plane, integrating the capabilities of various OpenAI-based tools to deliver a more unified and personalized user experience. Here are a few examples showcased at the conference, demonstrating the real-world applications of this platform:

  1. Streamlined Grocery Shopping: If you're browsing a food recipe on your Edge browser, the platform can intelligently extract the ingredients and add them to your connected grocery shopping list service, saving you the hassle of manual entry.

  2. Automated Legal Document Generation: By integrating a legal service with the platform, users can generate a legal document from a template. Copilot assists in iteratively improving the document, taking into account the connected service's understanding of liability and risk specific to a context and jurisdiction.

  3. Boosted Focus Mode: If a user wants to concentrate on their work, the Windows Copilot provides actionable suggestions to activate features such as Windows Snap mode and dark mode, alongside a timer.

  4. Podcast Automation: After recording a podcast, generate a transcription; using the web, generate a bio for the guest; draft a social media post with cover art; and then post it to LinkedIn. (see source code)

Azure AI Studio: Power at Your Fingertips

In addition to the extension of the ChatGPT plugin framework, Microsoft Build 2023 also introduced Azure AI Studio. This suite promises a flurry of features, empowering developers to construct and train their own large language models (LLMs) using a combination of Azure OpenAI Services and both closed and open models. Key advertised features of Azure AI Studio include built-in vector indexing, the simplification of retrieval-augmented generation, and 'Prompt Workflows', which provide a visual means to design conversation flows (inputs, outputs) with AI. Under the hood we'll find open-source LangChain, Microsoft's Semantic Kernel, and of course numerous Azure-based AI services, some of which are powered by OpenAI.

AI safety is to be a fundamental aspect of Azure AI Studio, ensuring AI endeavors remain secure and ethical. "Media provenance tools" are described as providing "[c]ryptographic methods used to mark and sign AI-generated content with metadata about its origin, enabling consumers to verify whether an image or video was generated by AI."

Microsoft's "BYO?" approach, familiar from Azure and .NET, which invites engineers from many tech stacks to jump in and build, will seemingly be a key component of Azure AI Studio. AI Studio will support a variety of hosted models with available fine-tuning, as well as external models. This flexibility will let us leverage the power of Azure AI Studio regardless of our preferred or mandated models.

Conclusion

Microsoft is clearly reaping the benefits of their investments in OpenAI. However, there's no time for celebration, as the rapid pace of open innovation continues to keep the pressure on industry leaders like Microsoft and Google. Offering an extensible platform has proven to be a route to success, enhancing the value proposition of the platform for customers. As the customer reaps more benefits, the platform becomes an integral part of their operations.

References

  1. "OpenAI, the artificial intelligence firm originally founded by Elon Musk and Y Combinator's Sam Altman, just landed a $1 billion investment from Microsoft."
  2. "Microsoft and OpenAI will build AI supercomputing technologies on Azure"
  3. "Microsoft and OpenAI announce exclusive GPT-3 licensing deal"
  4. https://www.theverge.com/2020/9/22/21449238/microsoft-openai-exclusive-license-gpt-3-language-model
  5. "Microsoft Corp. is investing $10 billion in OpenAI, whose artificial intelligence tool ChatGPT has lit up the internet since its introduction in November, amassing more than a million users within days and touching off a fresh debate over the role of AI in the workplace."
  6. "Leaked Google engineer memo warns that Big Tech could lose AI race to the little guys"
  7. "Welcome to LangChain"
  8. "What is Semantic Kernel?"

Shiny New Toy

TL;DR: ChatGPT is a fun toy that is getting more powerful by the day. The plugin framework is a shiny new toy that promises to make ChatGPT a powerful tool for automating tasks and connecting to the internet. I'm excited to see where it goes.

Enter Stage Right: ChatGPT's initial plugin framework release promises to dazzle by connecting it to the internet via REST APIs. This is a big deal, and I'm excited to see where it goes. I've been tinkering with the plugin framework for a few days now, and I'm impressed with the simplicity of the implementation, the power of the results, and the speed at which OpenAI is iterating on the framework. I'm not sure whether I'm more excited about the possibilities for ChatGPT or for the plugin framework itself; I'm sure both will be amazing.

OpenAI plugins connect ChatGPT to third-party applications. These plugins enable ChatGPT to interact with APIs defined by developers, enhancing ChatGPT's capabilities and allowing it to perform a wide range of actions.

OpenAI, Chat Plugins [30 April 2023]

Availability

At the moment, plugin development is a Limited Alpha behind a wait-list, and plugins can be shared with up to 15 fellow developers (who also have access to plugins). It's early times for sure, but if the pace of ChatGPT over the past year is any indication, we are sure to have a full marketplace of both free and paid plugins within weeks of my writing this blog post.

Architecture

ChatGPT plugins are constructed with:

  1. One or more REST API endpoints implemented in any language and/or framework you prefer: Node.js, Deno, ASP .NET Controllers or Minimal API, Azure Functions, Ruby on Rails, Rust, Go, Bip Functions, AWS Lambdas, Wolfram or whatever else you fancy (COBOL diehards rejoice!). Coded APIs aren't the only option either; you can just as easily use a low-code/no-code platform like Power Apps, Power Automate, Azure APIM, Azure Logic Apps, Data API Builder, or OutSystems. The only requirement is that your API is accessible via the internet.
    • OpenAI provides a Python-based "TODO API" plugin to get you started with local experimentation. It seems pretty useless until you realize that by tinkering with the inputs and outputs, descriptions, even parameter names, you will begin to witness the power of ChatGPT's language processing capabilities.
  2. A manifest file (yourdomain.com/.well-known/ai-plugin.json) describing the plugin to ChatGPT; a sample manifest follows this list. The contents and structure of this file are likely to change, but the most critical bits are:
    • Plugin name and description (for display in the ChatGPT UI)
    • Plugin description (for the model to understand at a high level what your plugin does)
    • A logo URL
    • Your contact and legal information
    • Authentication mechanism, if applicable to your API
  3. An OpenAPI specification (YAML) file. ChatGPT's language processing is a "natural" 🤣 at parsing this contract to understand what each API endpoint can do (descriptions, route methods) as well as how the input & output models are shaped (component schemas). OpenAPI isn't to be confused with OpenAI; the specification predates OpenAI and is used for all sorts of things unrelated to AI. It's a well-established standard used by many API providers, including Microsoft, Google, and Amazon. This alone is the most exciting/terrifying part - public REST APIs around the globe are already in a position to offer knowledge and actions to ChatGPT, merely by having an OpenAPI specification available. What's more, the same specification supports describing the most common authentication patterns, so you can offer ChatGPT users the ability to authenticate with your API and perform actions on their behalf.
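
To make the manifest concrete, here's roughly what a no-auth plugin's ai-plugin.json looks like. The field names follow OpenAI's plugin documentation at the time of writing, but treat the values (names, URLs, email) as illustrative placeholders and double-check the current docs before relying on them:

{
  "schema_version": "v1",
  "name_for_human": "TODO Plugin (no auth)",
  "name_for_model": "todo",
  "description_for_human": "Manage your TODO list. You can add, remove and view your TODOs.",
  "description_for_model": "Plugin for managing a TODO list. You can add, remove and view your TODOs.",
  "auth": { "type": "none" },
  "api": { "type": "openapi", "url": "http://localhost:5003/openapi.yaml" },
  "logo_url": "http://localhost:5003/logo.png",
  "contact_email": "you@example.com",
  "legal_info_url": "http://www.example.com/legal"
}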

Plugin Flow (Production)

sequenceDiagram
  autonumber
  actor User
  participant Client Browser
  participant OpenAI Servers
  participant Plugin Provider API
  User->>Client Browser: New ChatGPT Chat
  Client Browser->>OpenAI Servers: Register Plugin with User's ChatGPT profile
  OpenAI Servers-->>Client Browser: Inject compact plugin context (description & endpoints) into hidden prompt message
  Client Browser-->>User: Render UI
  User->>Client Browser: Prompt
  activate Client Browser
  Client Browser->>OpenAI Servers: User Prompt + Plugin context
  activate OpenAI Servers
  OpenAI Servers-->>OpenAI Servers: AI Reasoning
  OpenAI Servers->>Plugin Provider API: Perform commands and/or queries against Plugin API
  activate Plugin Provider API
  Plugin Provider API-->>Plugin Provider API: Plugin logic
  Plugin Provider API-->>OpenAI Servers: Text, Links or Markdown
  deactivate Plugin Provider API
  OpenAI Servers-->>OpenAI Servers: Incorporate Plugin API results into response
  OpenAI Servers-->>Client Browser: Text & Meta
  Client Browser-->>User: Render response
  deactivate OpenAI Servers
  deactivate Client Browser

Plugin Flow (Local Development)

To enable local development of a plugin, ChatGPT turns the tables and calls your API from the browser. This changes the flow dramatically:

sequenceDiagram
  autonumber
  participant Plugin Provider API
  actor User
  participant Client Browser
  participant OpenAI Servers
  User->>Client Browser: New ChatGPT Chat
  Client Browser->>OpenAI Servers: Register Plugin with User's ChatGPT profile
  OpenAI Servers-->>Client Browser: Inject compact plugin context (description & endpoints) into hidden prompt message
  Client Browser-->>User: Render UI
  User->>Client Browser: Prompt
  activate Client Browser
  Client Browser->>OpenAI Servers: User Prompt + Plugin context
  activate OpenAI Servers
  OpenAI Servers-->>OpenAI Servers: AI Reasoning
  OpenAI Servers-->>Client Browser: Request Plugin API call
  Client Browser->>Plugin Provider API: Perform commands and/or queries against Plugin API
  activate Plugin Provider API
  Plugin Provider API-->>Plugin Provider API: Plugin logic
  Plugin Provider API-->>Client Browser: Text, Links or Markdown
  deactivate Plugin Provider API
  Client Browser-->>OpenAI Servers: Relay response to Server
  OpenAI Servers-->>OpenAI Servers: Incorporate Plugin API results into response
  OpenAI Servers-->>Client Browser: Text & Meta
  Client Browser-->>User: Render response
  deactivate OpenAI Servers
  deactivate Client Browser
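
One practical consequence of this browser-driven flow: your locally running plugin has to allow cross-origin requests from ChatGPT's origin, since the browser (not OpenAI's servers) is the one calling your API. The OpenAI quickstart handles this with the quart-cors package; a minimal sketch of that setup is below (package and origin as used in the sample at the time - worth verifying against the repo):

# main.py (excerpt) -- allow the ChatGPT web client to call the local API directly
import quart
import quart_cors

# Wrap the Quart app so responses include CORS headers for chat.openai.com
app = quart_cors.cors(quart.Quart(__name__), allow_origin="https://chat.openai.com")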

Local "Quickstart" Demo

OpenAI provides a sample TODO API on GitHub that you can clone or download and try out with ChatGPT in a few minutes' time. It's implemented in Python, so if you don't already have a local Python development environment set up, you'll need to do that first.

There are numerous ways of getting Python running locally. For beginners, the article below walks you through setting up Python 3 on your operating system of choice:

Python 3 Installation & Setup Guide

Running the Plugin (API) Locally

Once you have Python set up, execute the TODO API with python main.py and you'll see the server spin up and await calls from ChatGPT. The output should look something like this:

> python main.py

* Serving Quart app 'main'
* Environment: production
* Please use an ASGI server (e.g. Hypercorn) directly in production
* Debug mode: True
* Running on http://0.0.0.0:5003 (CTRL + C to quit)
[2023-04-30 17:56:58 -0400] [4776] [INFO] Running on http://0.0.0.0:5003 (CTRL + C to quit)
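
Before connecting ChatGPT, it's worth confirming the plugin is actually serving its manifest and OpenAPI spec. A quick sanity check, assuming the requests package is installed and the routes match the sample:

# Quick local check that the plugin's discovery files are reachable
import requests

base = "http://localhost:5003"
print(requests.get(f"{base}/.well-known/ai-plugin.json").status_code)  # expect 200
print(requests.get(f"{base}/openapi.yaml").status_code)                # expect 200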

Connect ChatGPT

Getting ChatGPT connected to the local TODO API is pretty straightforward. In a nutshell:

  1. From any ChatGPT chat, change your Model to Plugins and then use the dropdown to navigate to the Plugin Store
  2. Click Develop your own plugin
  3. Supply the URL with port number (such as http://localhost:5003)

Once connected properly, you'll see the logo of your plugin displayed in the "Plugins" dropdown.

Test Drive!

By installing your plugin in the previous step, you're providing ChatGPT with the manifest and OpenAPI specification, both of which inform the model of what your plugin API can do and how to interact.

Let's get the show on the road. I'd like to know what's on my TODO list (it should be empty at the moment), and I'll add some items afterward:

[Screenshot: ChatGPT checks the (empty) TODO list via the plugin]

Okay that's neat. What happens if I try to add more than one item at a time?

[Screenshot: ChatGPT adds seven items, one API call per item]

I'm mildly impressed that this prompt did all these things, but I'm less impressed that it required 7 API calls to add 7 items. Why on earth didn't ChatGPT just send them all in one request? Because our API doesn't support that in the request model, and ChatGPT is being a good consumer of the API.

We can refactor the Python code pretty quickly to support this. We'll change the request model to accept a list of items, then change the code to iterate over the list and add each item. Let's do it...

# main.py

# ...
@app.post("/todos/<string:username>")
async def add_todo(username):
    request = await quart.request.get_json(force=True)
    if username not in _TODOS:
        _TODOS[username] = []

    _TODOS[username].append(request["todo"])
    return quart.Response(response='OK', status=200)

# ... becomes:
@app.post("/todos/<string:username>")
async def add_todos(username):
    request = await quart.request.get_json(force=True)
    if username not in _TODOS:
        _TODOS[username] = []

    if 'todos' in request and isinstance(request['todos'], list):
        todos = request['todos']  # extract the list of todos
        for todo in todos:
            if todo not in _TODOS[username]:  # don't add duplicates
                _TODOS[username].append(todo)
        return quart.Response(response='OK', status=200)
    else:
        return quart.Response(response='Bad Request, todos missing', status=400)
# openapi.yaml

# ...
components:
  schemas:
    getTodosResponse:
      # ...
    addTodoRequest:
      type: object
      required:
        - todo
      properties:
        todo:
          type: string
          description: The todo to add to the list.
          required: true

# ... becomes:

    addTodoRequest:
      type: object
      required:
        - todos
      properties:
        todos:
          type: array
          items:
            type: string
          description: The todos to add to the list.
          required: true

Let's summarize the changes we made:

  • We changed the code to iterate over the list of todos
  • We tweaked the code to disallow duplicate todos in the datastore (an unplanned change, but a good one). Also unplanned, we added a check to make sure the request contained a list of todos.
  • We changed the request model to accept a list of todos
  • We updated the specification's request model definition and description
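
Before heading back to ChatGPT, here's a quick way to sanity-check the refactored endpoint directly (assuming the plugin is still running on port 5003 and the requests package is installed; the username path segment is arbitrary):

# Exercise the new batch add endpoint locally
import requests

base = "http://localhost:5003"
user = "global"  # any value works; the sample keys its in-memory store by this path segment

resp = requests.post(f"{base}/todos/{user}", json={"todos": ["milk", "eggs", "bread"]})
print(resp.status_code, resp.text)                  # expect: 200 OK

print(requests.get(f"{base}/todos/{user}").json())  # expect: ['milk', 'eggs', 'bread']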

And now we can test it out:

[Screenshot: ChatGPT adds multiple items with a single API call]

Only one request. Nice! Let's flex our new-found muscles and add support for deleting multiple todos. Our changes will be very similar:

# main.py

# ...
@app.delete("/todos/<string:username>")
async def delete_todo(username):
    request = await quart.request.get_json(force=True)
    todo_idx = request["todo_idx"]
    # fail silently, it's a simple plugin
    if 0 <= todo_idx < len(_TODOS[username]):
        _TODOS[username].pop(todo_idx)
    return quart.Response(response='OK', status=200)

# ... becomes:

@app.delete("/todos/<string:username>")
async def delete_todos(username):
    request = await quart.request.get_json(force=True)
    if 'todos_idx' in request and isinstance(request['todos_idx'], list):
        todos_idx = request["todos_idx"]
        # delete from the highest index first so earlier pops don't shift later indexes
        for todo_idx in sorted(todos_idx, reverse=True):
            if 0 <= todo_idx < len(_TODOS[username]):
                _TODOS[username].pop(todo_idx)
        return quart.Response(response='OK', status=200)
    else:
        return quart.Response(response='Bad Request, todos_idx missing', status=400)
# openapi.yaml

# ...
    delete:
      operationId: deleteTodo
      summary: Delete a todo from the list
      # ...
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/deleteTodoRequest'
      # ...

# ... becomes:

    delete:
      operationId: deleteTodos
      summary: Delete todos from the list
      # ...
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/deleteTodosRequest'
      # ...

# ...
components:
  schemas:
    # ...
    deleteTodoRequest:
      type: object
      required:
        - todo_idx
      properties:
        todo_idx:
          type: integer
          description: The index of the todo to delete.
          required: true

# ... becomes:

    deleteTodosRequest:
      type: object
      required:
        - todos_idx
      properties:
        todos_idx:
          type: array
          items:
            type: integer
          description: The indexes of the todos to delete.
          required: true

Let's summarize the changes we made:

  • We changed the code to iterate over the list of todo indexes
  • We changed the request model to accept a list of todo indexes
  • We updated the specification's request model definition and description

Let's clear out our TODO list all at once:

[Screenshot: ChatGPT clears the TODO list in a single request]

I'm pretty forgetful. For instance, I've been known to duplicate items on the grocery shopping list. What would be the outcome with this ChatGPT TODO plugin as it's currently implemented?

[Screenshot: ChatGPT appears to add a duplicate item that the plugin silently ignores]

Uh oh. Our plugin code disregards the duplicate entry, but as far as ChatGPT is concerned all is well, right up until it loads the list from the plugin again and compares it to its own internal list. We need to provide some clarity to the AI.

The solution is for the plugin to respond to duplicate entry attempts with an informative message for ChatGPT to take into its context:

[Screenshot: ChatGPT reports that the duplicate item was already on the list]

ChatGPT understands that dentist can't be added twice, but I'm not happy with its mischaracterization "...was already on your list, so I didn't add it again." Lies! ChatGPT did indeed try to add dentist a second time and was merely prohibited.

I can resolve this by informing ChatGPT that duplicates should not be added. That might, however, result in ChatGPT querying all TODOs before each addition, causing a lot of unnecessary chatter. Instead, I'll tweak the plugin again, this time instructing ChatGPT to update its internal list when conflicts such as this occur:

[Screenshot: ChatGPT updates its internal list after the conflict]
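
The code behind these last two tweaks isn't shown above, but a rough sketch of a duplicate-aware add_todos handler, building on the earlier refactor, might look like this (the response wording is illustrative, not the exact text used):

# main.py (sketch) -- report duplicates back to ChatGPT instead of ignoring them,
# and nudge the model to treat the server-side list as authoritative
@app.post("/todos/<string:username>")
async def add_todos(username):
    request = await quart.request.get_json(force=True)
    if username not in _TODOS:
        _TODOS[username] = []

    if 'todos' in request and isinstance(request['todos'], list):
        added, duplicates = [], []
        for todo in request['todos']:
            if todo in _TODOS[username]:
                duplicates.append(todo)
            else:
                _TODOS[username].append(todo)
                added.append(todo)
        if duplicates:
            # Illustrative message; the goal is to give the model context it can act on
            message = (f"Added: {added}. Already on the list (not added again): {duplicates}. "
                       "The server's list is authoritative; update your internal list to match.")
            return quart.Response(response=message, status=200)
        return quart.Response(response='OK', status=200)
    return quart.Response(response='Bad Request, todos missing', status=400)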

The Plugin Store

ChatGPT's Plugin Store already has a wide variety of offerings. In the few days it's taken me to write this blog post, I've seen a lot of new plugins pop up. At the time of this writing, there are 0 plugins available to me as a registered developer.

Wrap-Up

What I've demonstrated here is small potatoes compared to what's possible. From within ChatGPT's UI, we can now make dinner reservations, get prices on flights, and even get a list of apartments for rent in a city of our choice. And that's just the beginning.

OpenAI's ChatGPT plugin framework shows a lot of promise, and it requires us to entertain many new ideas about how AI will interact with us. I've heard people draw comparisons to previous technological advancements, but I have a hard time doing so. This is a new thing, and it's going to take some time to figure out how to use it.

Returning from the philosophical, I see this use of plugins as a way to make ChatGPT more useful in the short term. It's a way to get more out of the platform without having to wait for OpenAI to build it. And it's a way to get more out of the platform by using existing REST APIs that we're already familiar with.

There's a gap, though. The plugins are only available from within the ChatGPT UI. If you're using OpenAI's REST API or one of the library abstractions, you're out of luck for now. Share your voice on the matter, and stay tuned for updates from the OpenAI team in response to community feedback.

References

  1. OpenAI Platform docs site is a must-read on all matters of interacting with OpenAI's API.
    • Several more plugin examples with source code. Check there for the use of OAuth and other authentication methods, for giving ChatGPT more memory of conversations, and for semantic search (aka "question answering").
    • Usage Policies includes a section on plugins. It's not mentioned in the enclosed changelog, so either it's been there all along or it's a new addition without notice.
  2. OpenAI Community Forum: "Context length VS Max token VS Maximum length"