AI (Artificial Intelligence aka Imitation Intelligence) aka Slop¶
Things are moving fast: Stable Diffusion, ChatGPT, etc.
- How to Use AI to Do Stuff: An Opinionated Guide
- openai/openai-cookbook: Examples and guides for using the OpenAI API
- GPT has entered the chat with Shawn "swyx" Wang (The Changelog #519) |> Changelog
- LLMs break the internet with Simon Willison (Changelog Interviews #534) |> Changelog
- Simon Willison’s Weblog | Items tagged chatgpt, openai
Slop is the new name for unwanted AI-generated content https://simonwillison.net/2024/May/8/slop/
Criticism¶
AI will have eaten all our hobbies long before it fired us from our job. -- The End of Writing
The following is a summary of the article Is AI the Paperclip? by Nicholas Carr. The summary was made by a Chrome extension I wrote to parse and summarize articles with AI (google/gemini-3-flash-preview in this case). In the end, the parsed-out content is almost as long as the article. I read both. Is it a deficit not to come up with the key points and the timeline myself? Anyway - it's horrifying.
Tech leaders embrace exponential resource consumption to maintain linear artificial intelligence performance gains¶
TL;DR (120 words): Nick Bostrom’s 2003 "paperclip maximizer" thought experiment warned that an AI might destroy the world while chasing a single goal. Today, this scenario applies to the humans building AI. Current neural networks follow a logarithmic scale: maintaining steady improvement requires exponential increases in resources. To gain tiny advantages, executives like OpenAI’s Sam Altman and xAI’s Elon Musk are committing massive amounts of energy, water, land, and chips to development. Musk recently merged xAI with SpaceX, aiming for space-based scaling. This monomaniacal pursuit suggests that industry leaders will exhaust Earth’s resources and move into space to sustain AI growth, treating all physical and digital assets as raw material for a winner-take-all race.
Key Points (5 bullets, each ≤20 words):
- Nick Bostrom’s 2003 theory describes an AI destroying the world to maximize paperclip production through single-minded resource harvesting.
- Current AI development requires exponential resource increases to achieve only linear improvements in model performance and intelligence.
- Sam Altman identifies the intelligence of a model as the logarithm of resources used to train and run it.
- Tech leaders compete for winner-take-all rewards by devoting energy, water, and specialized chips to marginal scale advantages.
- Elon Musk merged xAI with SpaceX to pursue space-based AI scaling, extending resource harvesting beyond Earth's limits.
Timeline (ISO dates):
- 2003-01-01 — Nick Bostrom publishes "Ethical Issues in Advanced Artificial Intelligence" introducing the paperclip maximizer.
- 2023-09-01 — OpenAI CEO Sam Altman observes that AI intelligence equals the log of resources used.
- 2024-05-18 — Elon Musk announces merger of xAI into SpaceX to facilitate space-based AI scaling.
Notable Quotes (speaker — quote ≤20 words):
- Nick Bostrom — "a superintelligence whose top goal is the manufacturing of paperclips... starts transforming first all of earth..."
- Sam Altman — "The intelligence of an AI model roughly equals the log of the resources used to train and run it."
- Donald MacKenzie — "The more resources you put in, the better the results, but the rate of improvement steadily diminishes."
- Elon Musk — "In the long term, space-based AI is obviously the only way to scale."
Numbers to know (value — what it measures):
- Logarithmic — The function characterizing the relationship between AI intelligence gains and resource inputs.
- Exponential — The rate of resource growth required to maintain linear AI performance improvements.
Implications (next 0–3 months): Executives will likely increase capital expenditures on energy, water, and hardware to sustain marginal gains in AI model performance.
What's missing/uncertain: The specific monetary or social cost thresholds at which current AI scaling becomes unsustainable are not defined.
Source meta: New Cartographies; Nicholas Carr; 2024-05-24 (ISO, localize to Europe/Berlin); Not stated.
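The "intelligence equals the log of resources" claim can be made concrete with a toy calculation (my own illustration, not from the article): if capability scales with log10 of compute, then each additional capability "unit" costs ten times the resources of the last one - linear progress, exponential bill.

```python
import math

def toy_intelligence(resources: float) -> float:
    """Hypothetical capability metric under the log-scaling claim.

    Not a real benchmark - just log10 of resource units.
    """
    return math.log10(resources)

# Each +1 in "intelligence" requires 10x the resources:
assert toy_intelligence(1_000) == 3.0
assert toy_intelligence(10_000) == 4.0
assert toy_intelligence(100_000) == 5.0
```

Run the asserts in reverse and the point is the same: going from level 3 to level 5 is not twice the cost, it is a hundred times the cost.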
Gemini Nano in Chrome¶
The API is a WIP and the following links might be outdated!
2024-11-12 Getting started with window.ai in Chrome Canary — in browser Gemini LLM – OUseful.Info, the blog…
Getting Started: window.ai in Chrome | by Chris McKenzie | Medium
OpenAI API¶
2024-05-20 - they changed a lot in billing and API key setup (needs projects and organisations and stuff)
When using the OpenAI API, user input doesn't end up in OpenAI training data (https://openai.com/policies/api-data-usage-policies). This is different from ChatGPT, where user input might be used for training the model!
Starting on March 1, 2023, we are making two changes to our data usage and retention policies:
- OpenAI will not use data submitted by customers via our API to train or improve our models, unless you explicitly decide to share your data with us for this purpose. You can opt-in to share data.
- Any data sent through the API will be retained for abuse and misuse monitoring purposes for a maximum of 30 days, after which it will be deleted (unless otherwise required by law).
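For reference, a minimal sketch of what an API call looks like at the HTTP level, using only the stdlib. The endpoint and payload shape follow OpenAI's chat completions API; the key and model name are placeholders. The request is built but not sent.

```python
import json
import urllib.request

def build_chat_request(api_key: str, model: str, user_message: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request for the OpenAI API."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# urllib.request.urlopen(req) would actually send it; per the policy quoted
# above, data sent this way is not used for training.
```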
Perplexity AI¶
Better at knowledge retrieval and less of a content creator.
Anthropic AI Claude¶
Chat: https://claude.ai/ Developer: https://console.anthropic.com/dashboard
OpenRouter¶
A unified interface for LLMs - Find the best models & prices for your prompts
MS Copilot¶
Use it in Edge for work-related Office 365 stuff. That seems to be the most appropriate fit.
Obsidian and Open AI¶
Copilot¶
Too much to set up (with OpenRouter), but it's good.
Smart Connections Plugin¶
Smart Connections is an AI-powered plugin designed to democratize access to AI technology and empower individuals with innovative tools. It offers features like Smart View and Smart Chat, making it easier than ever to stay organized and uncover hidden connections between notes.
Smart Connections uses OpenAI API embeddings for search and for making connections between notes.
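The mechanism behind this is simple: each note is embedded as a vector, and "connected" notes are those whose vectors point in the most similar direction. A stdlib-only sketch with made-up 3-dimensional vectors (real embeddings from OpenAI's API have hundreds of dimensions; the note names are hypothetical):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" for three notes:
notes = {
    "gardening.md": [0.9, 0.1, 0.0],
    "tomatoes.md":  [0.8, 0.2, 0.1],
    "taxes.md":     [0.0, 0.1, 0.9],
}

query = [0.8, 0.2, 0.1]  # embedding of the note currently open
best = max(notes, key=lambda name: cosine_similarity(query, notes[name]))
# best is "tomatoes.md" - identical direction to the query
```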
VSCode¶
Sourcegraph Cody¶
Using this right now (2024-06-14) and it has gotten better. Claude 3 and GPT-4o work alright, and it's a bit cheaper than other services. Chat is pretty good. New context sources too.
Phind¶
Started as a ChatGPT interface for coding questions, then became a VSCode extension to interact with the chat. So it's not autocomplete but more a glorified chat interface with some sort of context awareness (referencing open files in the project). They have a usage limit for their GPT-4 model, and they have their own model. And I have to say - this came up with the best solutions so far when I used it as a rubberduck.
Rubberduck¶
Just an integration of the OpenAI API, fine-tuned for coding, covering the usual tasks like documenting, explaining, chat, and generating code. The latter all without context, or only the limited context you can send over. Therefore, it's a rubberduck to bounce ideas off in the chat and to get some extra help here and there writing self-contained code snippets.
Github Copilot¶
Can't log in for some reason.
Tab 9¶
An unlimited free version with very short autocompletion, and 15 USD for more complex autocompletion, chat, and all that jazz.
- Used the paid version for a bit. Autocomplete wasn't overwhelming, often accurate, but no mind reading here either.
- The chat was quite disappointing, often giving very basic answers. I expected more since Tab 9 indexes the codebase. Maybe I used it wrong!
- Nevertheless, Tab 9 offers some good things, and your code stays private.
Rift¶
Not tested yet
https://github.com/morph-labs/rift
Continue¶
Not tested yet
https://github.com/continuedev/continue
Code GPT¶
Not tested, probably never will
Prompts¶
Super Prompt¶
Output an overview of every single dimension of my request. Find points of uncertainty. Then, ask me as many clarifying questions as possible.
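One way to wire this in (a sketch; the message layout is just the standard chat format, nothing prompt-specific): prepend it as a system message so every request gets the clarifying-questions treatment before any answer.

```python
SUPER_PROMPT = (
    "Output an overview of every single dimension of my request. "
    "Find points of uncertainty. Then, ask me as many clarifying "
    "questions as possible."
)

def with_super_prompt(user_request: str) -> list[dict]:
    """Wrap a request in the standard chat message format, Super Prompt first."""
    return [
        {"role": "system", "content": SUPER_PROMPT},
        {"role": "user", "content": user_request},
    ]
```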
Cleanup OCR¶
```python
import logging
from typing import Tuple

async def process_chunk(chunk: str, prev_context: str, chunk_index: int, total_chunks: int,
                        reformat_as_markdown: bool, suppress_headers_and_page_numbers: bool) -> Tuple[str, str]:
    logging.info(f"Processing chunk {chunk_index + 1}/{total_chunks} (length: {len(chunk):,} characters)")

    # Step 1: OCR Correction
    ocr_correction_prompt = f"""Correct OCR-induced errors in the text, ensuring it flows coherently with the previous context. Follow these guidelines:

1. Fix OCR-induced typos and errors:
   - Correct words split across line breaks
   - Fix common OCR errors (e.g., 'rn' misread as 'm')
   - Use context and common sense to correct errors
   - Only fix clear errors, don't alter the content unnecessarily
   - Do not add extra periods or any unnecessary punctuation

2. Maintain original structure:
   - Keep all headings and subheadings intact

3. Preserve original content:
   - Keep all important information from the original text
   - Do not add any new information not present in the original text
   - Remove unnecessary line breaks within sentences or paragraphs
   - Maintain paragraph breaks

4. Maintain coherence:
   - Ensure the content connects smoothly with the previous context
   - Handle text that starts or ends mid-sentence appropriately

IMPORTANT: Respond ONLY with the corrected text. Preserve all original formatting, including line breaks. Do not include any introduction, explanation, or metadata.

Previous context:
{prev_context[-500:]}

Current chunk to process:
{chunk}

Corrected text:
"""
    ocr_corrected_chunk = await generate_completion(ocr_correction_prompt, max_tokens=len(chunk) + 500)
    processed_chunk = ocr_corrected_chunk

    # Step 2: Markdown Formatting (if requested)
    if reformat_as_markdown:
        markdown_prompt = f"""Reformat the following text as markdown, improving readability while preserving the original structure. Follow these guidelines:

1. Preserve all original headings, converting them to appropriate markdown heading levels (# for main titles, ## for subtitles, etc.)
   - Ensure each heading is on its own line
   - Add a blank line before and after each heading
2. Maintain the original paragraph structure. Remove all breaks within a word that should be a single word (for example, "cor- rect" should be "correct")
3. Format lists properly (unordered or ordered) if they exist in the original text
4. Use emphasis (*italic*) and strong emphasis (**bold**) where appropriate, based on the original formatting
5. Preserve all original content and meaning
6. Do not add any extra punctuation or modify the existing punctuation
7. Remove any spuriously inserted introductory text such as "Here is the corrected text:" that may have been added by the LLM and which is obviously not part of the original text.
8. Remove any obviously duplicated content that appears to have been accidentally included twice. Follow these strict guidelines:
   - Remove only exact or near-exact repeated paragraphs or sections within the main chunk.
   - Consider the context (before and after the main chunk) to identify duplicates that span chunk boundaries.
   - Do not remove content that is simply similar but conveys different information.
   - Preserve all unique content, even if it seems redundant.
   - Ensure the text flows smoothly after removal.
   - Do not add any new content or explanations.
   - If no obvious duplicates are found, return the main chunk unchanged.
9. {"Identify but do not remove headers, footers, or page numbers. Instead, format them distinctly, e.g., as blockquotes." if not suppress_headers_and_page_numbers else "Carefully remove headers, footers, and page numbers while preserving all other content."}

Text to reformat:
{ocr_corrected_chunk}

Reformatted markdown:
"""
        processed_chunk = await generate_completion(markdown_prompt, max_tokens=len(ocr_corrected_chunk) + 500)

    new_context = processed_chunk[-1000:]  # last 1000 characters become context for the next chunk
    logging.info(f"Chunk {chunk_index + 1}/{total_chunks} processed. Output length: {len(processed_chunk):,} characters")
    return processed_chunk, new_context
```
https://github.com/Dicklesworthstone/llm_aided_ocr/blob/main/llm_aided_ocr.py
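process_chunk returns (processed_chunk, new_context); a driver threads that context through successive chunks. This is my own sketch, not from the script - split_into_chunks is a naive stand-in for the script's real chunking, and generate_completion (the LLM call) must be defined elsewhere:

```python
import asyncio

def split_into_chunks(text: str, chunk_size: int = 4000) -> list[str]:
    # naive fixed-size split; the real script chooses boundaries more carefully
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

async def process_document(text: str, reformat_as_markdown: bool = True) -> str:
    """Run process_chunk (defined above) over every chunk, feeding the
    tail of each result forward as context for the next."""
    chunks = split_into_chunks(text)
    prev_context, parts = "", []
    for i, chunk in enumerate(chunks):
        processed, prev_context = await process_chunk(
            chunk, prev_context, i, len(chunks),
            reformat_as_markdown, suppress_headers_and_page_numbers=False,
        )
        parts.append(processed)
    return "\n".join(parts)

# usage: cleaned = asyncio.run(process_document(raw_ocr_text))
```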