Troubleshoot the Bus

When instruments stop responding, return unexpected data, or the bus hangs, mcgpib provides several diagnostic tools to identify and recover from the problem. This guide covers the most common failure modes and how to resolve them.

flowchart TD
    START["check_srq"]
    ASSERTED{"SRQ<br/>asserted?"}
    FIND["find_rqs<br/>(identify source)"]
    FOUND{"Device<br/>found?"}
    SPOLL["serial_poll<br/>(read status byte)"]
    INTERPRET["Interpret status bits<br/>RQS / ESB / MAV"]
    MANUAL["serial_poll all<br/>known addresses"]
    DONE(["Handle condition"])
    CLEAR(["No service<br/>needed"])

    START --> ASSERTED
    ASSERTED -- "Yes" --> FIND
    ASSERTED -- "No" --> CLEAR
    FIND --> FOUND
    FOUND -- "Yes" --> SPOLL
    FOUND -- "No" --> MANUAL
    SPOLL --> INTERPRET
    MANUAL --> INTERPRET
    INTERPRET --> DONE

    style START fill:#78350f,stroke:#d97706,color:#fde68a
    style ASSERTED fill:#78350f,stroke:#d97706,color:#fde68a
    style FOUND fill:#78350f,stroke:#d97706,color:#fde68a
    style DONE fill:#166534,stroke:#22c55e,color:#dcfce7
    style CLEAR fill:#334155,stroke:#94a3b8,color:#e2e8f0

GPIB instruments signal that they need attention by asserting the SRQ (Service Request) line. This is a shared bus signal — any instrument can assert it.

> Check if any instrument is requesting service on bench-a

The LLM calls check_srq("bench-a"). If SRQ is asserted, mcgpib automatically calls find_rqs to identify which device raised it:

SRQ ASSERTED on bench-a — device at address 5 requesting service (status=0x50)

If find_rqs cannot pinpoint the source:

SRQ ASSERTED on bench-a — use serial_poll to identify the source

Serial polling reads the status byte from an instrument without disrupting its operation. This is the primary way to determine what an instrument needs.

> Serial poll address 5 on bench-a

The LLM calls serial_poll("bench-a", 5):

Serial poll on bench-a:
Address 5: status=0x50 (80) [SRQ]

> Serial poll everything on bench-a

The LLM calls serial_poll("bench-a") with no address, which polls all listeners discovered in the last scan:

Serial poll on bench-a:
Address 1: status=0x00 (0)
Address 5: status=0x50 (80) [SRQ]
Address 22: status=0x00 (0)
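The flow described so far (check SRQ, try find_rqs, fall back to polling every known address) can be sketched in Python. This is a minimal illustration under stated assumptions, not mcgpib's implementation: `locate_srq_sources` and its parameters are hypothetical, and `poll` stands in for whatever performs a serial poll and returns the status byte as an integer. The fallback keeps only devices whose status byte has bit 6 (0x40, RQS) set.

```python
from typing import Callable, Dict, Iterable, Optional

RQS_MASK = 0x40  # bit 6 of the status byte: device is requesting service


def locate_srq_sources(
    srq_asserted: bool,
    rqs_address: Optional[int],
    poll: Callable[[int], int],
    known_addresses: Iterable[int],
) -> Dict[int, int]:
    """Return {address: status_byte} for every device found requesting service.

    All names are hypothetical. `poll(address)` is assumed to serial poll one
    device and return its status byte as an int.
    """
    if not srq_asserted:
        return {}  # no service needed
    if rqs_address is not None:
        # find_rqs pinpointed the source: poll only that device
        return {rqs_address: poll(rqs_address)}
    # Fallback: poll every known address and keep those with RQS set
    statuses = {addr: poll(addr) for addr in known_addresses}
    return {addr: stb for addr, stb in statuses.items() if stb & RQS_MASK}


# Usage with canned status bytes matching the example output above
fake_bus = {1: 0x00, 5: 0x50, 22: 0x00}
sources = locate_srq_sources(True, None, fake_bus.__getitem__, fake_bus)
print(sources)  # {5: 80}
```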

The status byte is an 8-bit value where each bit has a defined meaning per IEEE 488.2:

| Bit | Mask | Meaning |
| --- | --- | --- |
| 7 | 0x80 | Instrument-specific |
| 6 | 0x40 | RQS: device is requesting service (SRQ source) |
| 5 | 0x20 | ESB: event status bit (check *ESR? for details) |
| 4 | 0x10 | MAV: message available in output buffer |
| 3-0 | 0x0F | Instrument-specific |

A status byte of 0x50 means bits 6 and 4 are set: the instrument is requesting service (RQS) and has a message available (MAV).
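Decoding by hand gets tedious when polling many instruments. The sketch below applies the masks listed above to a raw status byte; `decode_status_byte` is a hypothetical helper written for illustration, not part of mcgpib.

```python
# Standard status byte bits, as listed above
STATUS_BITS = {
    0x40: "RQS",
    0x20: "ESB",
    0x10: "MAV",
}
INSTRUMENT_SPECIFIC_MASK = 0x8F  # bit 7 plus bits 3-0


def decode_status_byte(stb: int) -> list:
    """Return the names of the standard bits set in a status byte."""
    if not 0 <= stb <= 0xFF:
        raise ValueError(f"status byte out of range: {stb!r}")
    flags = [name for mask, name in STATUS_BITS.items() if stb & mask]
    vendor = stb & INSTRUMENT_SPECIFIC_MASK
    if vendor:
        # Meaning depends on the instrument; report the raw bits only
        flags.append(f"instrument-specific=0x{vendor:02X}")
    return flags


print(decode_status_byte(0x50))  # ['RQS', 'MAV']
print(decode_status_byte(0x00))  # []
```

The instrument-specific bits are reported as a raw mask because their meaning has to come from the instrument's own manual.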

flowchart TD
    START["Instrument not<br/>responding"]
    SCAN{"bus_scan —<br/>address found?"}
    HW["Check cables,<br/>power, address"]
    ERR["Query SYST:ERR?"]
    ERRQ{"Errors<br/>found?"}
    CLS["Send *CLS<br/>(clear status)"]
    RST["Send *RST<br/>(reset)"]
    RETEST{"Responding<br/>now?"}
    OK(["Recovered"])
    ESCALATE(["Hardware issue —<br/>check physical layer"])

    START --> SCAN
    SCAN -- "No" --> HW
    SCAN -- "Yes" --> ERR
    HW --> ESCALATE
    ERR --> ERRQ
    ERRQ -- "Yes" --> CLS
    ERRQ -- "No" --> RST
    CLS --> RETEST
    RST --> RETEST
    RETEST -- "Yes" --> OK
    RETEST -- "No" --> ESCALATE

    style START fill:#7f1d1d,stroke:#ef4444,color:#fecaca
    style OK fill:#166534,stroke:#22c55e,color:#dcfce7
    style ESCALATE fill:#7f1d1d,stroke:#ef4444,color:#fecaca
    style SCAN fill:#78350f,stroke:#d97706,color:#fde68a
    style ERRQ fill:#78350f,stroke:#d97706,color:#fde68a
    style RETEST fill:#78350f,stroke:#d97706,color:#fde68a

  1. Verify the instrument is on the bus

    > Scan bench-a without identification

    If the address does not appear, the problem is physical: check the GPIB cable, make sure the instrument is powered on, and verify its GPIB address setting.

  2. Check the instrument’s error queue

    > Query SYST:ERR? on address 22 on bench-a

    Read errors until you get 0,"No error". Some instruments lock up when their error queue is full.

  3. Clear the instrument’s status

    > Send *CLS to address 22 on bench-a

    The LLM calls instrument_write("bench-a", 22, "*CLS"). This clears status registers and error queues.

  4. Reset the instrument

    > Reset address 22 on bench-a

    The LLM calls instrument_reset("bench-a", 22), sending *RST. Wait several seconds for the instrument to reconfigure.
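Step 2 above, reading errors until the queue reports 0,"No error", amounts to a bounded loop. The following sketch assumes SCPI-style responses of the form `code,"message"`; `drain_error_queue` and its `query_error` callable are hypothetical stand-ins for whatever sends SYST:ERR? and returns the raw reply.

```python
from typing import Callable, List, Tuple


def drain_error_queue(
    query_error: Callable[[], str], max_reads: int = 32
) -> List[Tuple[int, str]]:
    """Read SYST:ERR? responses until the instrument reports code 0.

    `query_error` is a hypothetical callable that sends SYST:ERR? and returns
    the raw response, e.g. '-113,"Undefined header"'. `max_reads` guards
    against an instrument that never reports an empty queue.
    """
    errors = []
    for _ in range(max_reads):
        code_text, _, message = query_error().strip().partition(",")
        code = int(code_text)  # accepts "0", "+0", and negative codes
        if code == 0:
            return errors
        errors.append((code, message.strip().strip('"')))
    raise RuntimeError(f"error queue not empty after {max_reads} reads")


# Usage with canned responses standing in for a real instrument
responses = iter(['-113,"Undefined header"', '-222,"Data out of range"', '0,"No error"'])
print(drain_error_queue(lambda: next(responses)))
# [(-113, 'Undefined header'), (-222, 'Data out of range')]
```

The read limit is a design choice: without it, a misbehaving instrument that keeps returning errors would stall the diagnosis indefinitely.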

A timeout means the instrument did not respond within the configured read_timeout_ms. Common causes:

| Cause | Solution |
| --- | --- |
| Instrument is busy with a long measurement | Increase read_timeout_ms before the query |
| Wrong command (instrument has nothing to send) | Use instrument_write instead of instrument_query |
| Instrument is in local mode | Call instrument_remote to put it back in remote mode |
| Bus contention from another controller | Call interface_clear to reassert control |
| ESP32 brownout during rapid commands | Increase inter_command_delay_ms in the config |

When the GPIB bus stops responding to any commands:

  1. Assert Interface Clear

    > Send interface clear on bench-a

    The LLM calls interface_clear("bench-a"). This pulses the IFC line for 150 microseconds, forcing all devices to release the bus and making this bridge the Controller-In-Charge.

  2. Send a universal device clear

    > Send bus clear to all devices on bench-a

    The LLM calls bus_clear("bench-a") with no address. This sends DCL (Device Clear) to every device on the bus, which typically aborts any pending operation and clears input/output buffers.

  3. Re-scan to verify recovery

    > Scan bench-a

    If instruments reappear, the bus is recovered. If not, the problem is likely hardware (cable, connector, or a misbehaving instrument pulling bus lines low).
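The three recovery steps above are order-sensitive: the interface layer has to be reclaimed before a device clear can reach any instrument. A minimal sketch of that ordering follows. The three callables are hypothetical stand-ins for the interface_clear, bus_clear (no address), and scan tools on one bench, and `recover_bus` itself is not an mcgpib function.

```python
from typing import Callable, Iterable, Set


def recover_bus(
    interface_clear: Callable[[], None],
    bus_clear: Callable[[], None],
    scan: Callable[[], Iterable[int]],
    expected: Iterable[int],
) -> Set[int]:
    """Run IFC, then DCL, then re-scan; return expected addresses still missing."""
    interface_clear()     # step 1: reclaim the bus at the interface layer
    bus_clear()           # step 2: universal device clear for every instrument
    found = set(scan())   # step 3: see which addresses answer again
    return set(expected) - found


# Usage with stubs: address 22 does not come back after recovery
calls = []
missing = recover_bus(
    lambda: calls.append("IFC"),
    lambda: calls.append("DCL"),
    lambda: [1, 5],
    expected=[1, 5, 22],
)
print(calls, missing)  # ['IFC', 'DCL'] {22}
```

An empty result means every expected instrument reappeared. Any address left in the set is a candidate for the hardware checks mentioned above.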

bus_clear and interface_clear address different layers of the problem:

| Tool | What it does | When to use |
| --- | --- | --- |
| bus_clear (with address) | Sends Selected Device Clear (SDC) to one instrument | An instrument is stuck or has a full error queue |
| bus_clear (no address) | Sends Universal Device Clear (DCL) to all instruments | Multiple instruments are unresponsive |
| interface_clear | Pulses the IFC line; resets bus interfaces, not instrument state | Bus is completely hung, no commands get through |

flowchart LR
    subgraph interface_clear["interface_clear (IFC)"]
        direction TB
        IFC_A["Resets handshake lines"]
        IFC_B["Clears talker/listener<br/>assignments"]
        IFC_C["Reasserts Controller-<br/>In-Charge"]
    end

    subgraph bus_clear["bus_clear (DCL / SDC)"]
        direction TB
        BC_A["Clears instrument<br/>input/output buffers"]
        BC_B["Aborts pending<br/>operations"]
        BC_C["Resets instrument<br/>parser state"]
    end

    IFC_SCOPE["Bus interface<br/>layer"]
    BC_SCOPE["Instrument<br/>state"]

    interface_clear -.-> IFC_SCOPE
    bus_clear -.-> BC_SCOPE

    style IFC_SCOPE fill:#334155,stroke:#94a3b8,color:#e2e8f0
    style BC_SCOPE fill:#334155,stroke:#94a3b8,color:#e2e8f0
    style interface_clear fill:#1e293b,stroke:#d97706,color:#fde68a
    style bus_clear fill:#1e293b,stroke:#d97706,color:#fde68a

For systematic diagnosis, use the built-in prompt:

> Use the troubleshoot_instrument prompt for address 5 on bench-a

This guides the LLM through a six-step diagnostic sequence:

  1. Verify bus connectivity with a scan
  2. Check instrument identity with *IDN?
  3. Read status byte with serial poll
  4. Clear error state with *CLS and SYST:ERR?
  5. Reset the instrument with *RST
  6. Bus-level recovery with interface_clear and bus_clear

The prompt exits at the first step that resolves the problem, so minor issues are fixed quickly.

For suspected cable or connector problems, use bus_diagnostic to write known bit patterns directly to the bus lines:

> Run bus diagnostic on bench-a, write 0xAA to the data bus

The LLM calls bus_diagnostic("bench-a", mode=0, value=0xAA):

Diagnostic: writing 0xAA to data bus on bench-a for 10 seconds

This holds the pattern on the bus for 10 seconds, giving you time to probe individual lines with an oscilloscope or logic analyzer. Mode 0 writes to the data bus (DIO1-DIO8); mode 1 writes to the control bus (ATN, IFC, REN, SRQ, EOI).
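Before probing, it helps to know which data lines a given pattern should drive. The sketch below maps a byte to DIO line names, assuming the conventional IEEE 488 bit order in which DIO1 carries the least significant bit. `dio_lines` is a hypothetical helper. Whether a 1 bit appears as a high or low voltage on the wire depends on the firmware and is not settled here, so check the first probed line against a known pattern.

```python
def dio_lines(value: int) -> dict:
    """Map a data-bus byte to {"DIO1": bool, ..., "DIO8": bool}.

    Assumes IEEE 488 bit order (DIO1 = least significant bit). True means the
    bit is 1 in `value`; the resulting voltage level depends on the firmware.
    """
    if not 0 <= value <= 0xFF:
        raise ValueError("data bus value must fit in 8 bits")
    return {f"DIO{n}": bool(value >> (n - 1) & 1) for n in range(1, 9)}


pattern = dio_lines(0xAA)
print([name for name, bit in pattern.items() if bit])
# ['DIO2', 'DIO4', 'DIO6', 'DIO8']
```

Following 0xAA with its complement 0x55 exercises every data line in both states, which helps reveal a line that is stuck or shorted to a neighbor.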

The ESP32’s brownout detector is disabled in the AR488 firmware because GPIB bus activity causes transient voltage dips. However, if the ESP32 is powered from a USB port with marginal current capacity, you may see:

  • Bridge disconnects mid-sequence
  • Corrupted responses (partial or garbled data)
  • The bridge stops responding and requires a power cycle

Mitigations:

  • Use a powered USB hub or a USB port with reliable 500mA+ capacity
  • Increase inter_command_delay_ms to 20 to 50 ms to reduce peak current draw
  • For WiFi bridges, use a dedicated 5V power supply rather than USB power
  • Keep GPIB cable runs short (under 4 meters per segment, under 20 meters total)