Troubleshoot the Bus
When instruments stop responding, return unexpected data, or the bus hangs, mcgpib provides several diagnostic tools to identify and recover from the problem. This guide covers the most common failure modes and how to resolve them.
flowchart TD
START["check_srq"]
ASSERTED{"SRQ<br/>asserted?"}
FIND["find_rqs<br/>(identify source)"]
FOUND{"Device<br/>found?"}
SPOLL["serial_poll<br/>(read status byte)"]
INTERPRET["Interpret status bits<br/>RQS / ESB / MAV"]
MANUAL["serial_poll all<br/>known addresses"]
DONE(["Handle condition"])
CLEAR(["No service<br/>needed"])
START --> ASSERTED
ASSERTED -- "Yes" --> FIND
ASSERTED -- "No" --> CLEAR
FIND --> FOUND
FOUND -- "Yes" --> SPOLL
FOUND -- "No" --> MANUAL
SPOLL --> INTERPRET
MANUAL --> INTERPRET
INTERPRET --> DONE
style START fill:#78350f,stroke:#d97706,color:#fde68a
style ASSERTED fill:#78350f,stroke:#d97706,color:#fde68a
style FOUND fill:#78350f,stroke:#d97706,color:#fde68a
style DONE fill:#166534,stroke:#22c55e,color:#dcfce7
style CLEAR fill:#334155,stroke:#94a3b8,color:#e2e8f0
Check for service requests
GPIB instruments signal that they need attention by asserting the SRQ (Service Request) line. This is a shared bus signal: any instrument can assert it.

> Check if any instrument is requesting service on bench-a

The LLM calls check_srq("bench-a"). If SRQ is asserted, mcgpib automatically calls find_rqs to identify which device raised it:

    SRQ ASSERTED on bench-a — device at address 5 requesting service (status=0x50)

If find_rqs cannot pinpoint the source:

    SRQ ASSERTED on bench-a — use serial_poll to identify the source

Serial poll for status
Serial polling reads the status byte from an instrument without disrupting its operation. This is the primary way to determine what an instrument needs.
Poll a specific instrument
> Serial poll address 5 on bench-a

The LLM calls serial_poll("bench-a", 5):

    Serial poll on bench-a:
    Address 5: status=0x50 (80) [SRQ]

Poll all known instruments
> Serial poll everything on bench-a

The LLM calls serial_poll("bench-a") with no address, which polls all listeners discovered in the last scan:

    Serial poll on bench-a:
    Address 1: status=0x00 (0)
    Address 5: status=0x50 (80) [SRQ]
    Address 22: status=0x00 (0)

Interpret the status byte
The status byte is an 8-bit value where each bit has a defined meaning per IEEE 488.2:
| Bit | Mask | Meaning |
|---|---|---|
| 7 | 0x80 | Instrument-specific |
| 6 | 0x40 | RQS — device is requesting service (SRQ source) |
| 5 | 0x20 | ESB — event status bit (check *ESR? for details) |
| 4 | 0x10 | MAV — message available in output buffer |
| 3-0 | 0x0F | Instrument-specific |
A status byte of 0x50 means bits 6 and 4 are set: the instrument is requesting service (RQS) and has a message available (MAV).
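The table can be turned into a small decoder. This is a minimal sketch, not part of mcgpib: the function name `decode_status_byte` is made up for illustration, and it only reports the three bits that IEEE 488.2 defines, leaving the instrument-specific bits alone.

```python
# Bit masks taken from the status byte table.
RQS = 0x40  # device is requesting service (SRQ source)
ESB = 0x20  # event status bit; query *ESR? for details
MAV = 0x10  # message available in the output buffer


def decode_status_byte(status: int) -> list[str]:
    """Return the names of the standard bits that are set (hypothetical helper)."""
    flags = []
    if status & RQS:
        flags.append("RQS")
    if status & ESB:
        flags.append("ESB")
    if status & MAV:
        flags.append("MAV")
    return flags


print(decode_status_byte(0x50))  # ['RQS', 'MAV']
print(decode_status_byte(0x00))  # []
```

Bits 7 and 3-0 are deliberately ignored here because their meaning depends on the instrument; check its programming manual before interpreting them.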
Recover from common failures
Instrument not responding
flowchart TD
START["Instrument not<br/>responding"]
SCAN{"bus_scan —<br/>address found?"}
HW["Check cables,<br/>power, address"]
ERR["Query SYST:ERR?"]
ERRQ{"Errors<br/>found?"}
CLS["Send *CLS<br/>(clear status)"]
RST["Send *RST<br/>(reset)"]
RETEST{"Responding<br/>now?"}
OK(["Recovered"])
ESCALATE(["Hardware issue —<br/>check physical layer"])
START --> SCAN
SCAN -- "No" --> HW
SCAN -- "Yes" --> ERR
HW --> ESCALATE
ERR --> ERRQ
ERRQ -- "Yes" --> CLS
ERRQ -- "No" --> RST
CLS --> RETEST
RST --> RETEST
RETEST -- "Yes" --> OK
RETEST -- "No" --> ESCALATE
style START fill:#7f1d1d,stroke:#ef4444,color:#fecaca
style OK fill:#166534,stroke:#22c55e,color:#dcfce7
style ESCALATE fill:#7f1d1d,stroke:#ef4444,color:#fecaca
style SCAN fill:#78350f,stroke:#d97706,color:#fde68a
style ERRQ fill:#78350f,stroke:#d97706,color:#fde68a
style RETEST fill:#78350f,stroke:#d97706,color:#fde68a
- Verify the instrument is on the bus

  > Scan bench-a without identification

  If the address does not appear, the problem is physical: check the GPIB cable, make sure the instrument is powered on, and verify its GPIB address setting.

- Check the instrument’s error queue

  > Query SYST:ERR? on address 22 on bench-a

  Read errors until you get 0,"No error". Some instruments lock up when their error queue is full.

- Clear the instrument’s status

  > Send *CLS to address 22 on bench-a

  The LLM calls instrument_write("bench-a", 22, "*CLS"). This clears status registers and error queues.

- Reset the instrument

  > Reset address 22 on bench-a

  The LLM calls instrument_reset("bench-a", 22), sending *RST. Wait several seconds for the instrument to reconfigure.
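The "read errors until you get 0,"No error"" step is a loop, and it can be sketched as follows. This is an illustration under assumptions: `query` is a hypothetical callable standing in for whatever sends a command and returns the reply (it is not an mcgpib function), and the `max_reads` cap is an arbitrary safeguard against an instrument that never reports an empty queue.

```python
from typing import Callable


def drain_error_queue(query: Callable[[str], str], max_reads: int = 32) -> list[str]:
    """Read SYST:ERR? until the instrument reports error code 0.

    `query` is a hypothetical stand-in, not an mcgpib API.
    """
    errors = []
    for _ in range(max_reads):
        reply = query("SYST:ERR?").strip()
        # An empty queue is reported as code 0, e.g. 0,"No error" or +0,"No error".
        code = reply.split(",", 1)[0]
        if int(code) == 0:
            break
        errors.append(reply)
    return errors


# Simulated instrument with two queued errors followed by an empty queue.
replies = iter(['-113,"Undefined header"', '-222,"Data out of range"', '0,"No error"'])
print(drain_error_queue(lambda cmd: next(replies)))
```

Parsing the numeric code rather than comparing the whole string tolerates instruments that reply with +0 or a slightly different message text.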
Read timeouts
A timeout means the instrument did not respond within the configured read_timeout_ms. Common causes:
| Cause | Solution |
|---|---|
| Instrument is busy with a long measurement | Increase read_timeout_ms before the query |
| Wrong command: the instrument has nothing to send | Use instrument_write instead of instrument_query |
| Instrument is in local mode | Call instrument_remote to put it back in remote mode |
| Bus contention from another controller | Call interface_clear to reassert control |
| ESP32 brownout during rapid commands | Increase inter_command_delay_ms in the config |
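The "wrong command" row is easy to hit by accident: a command that produces no response will always time out when sent as a query. In IEEE 488.2 and SCPI, query headers end in "?", so a rough check is possible. The helper below is a hypothetical sketch, not part of mcgpib, and it is only a heuristic: it does not handle semicolons or question marks inside quoted string parameters.

```python
def expects_response(command: str) -> bool:
    """Heuristic: True if any unit of a command line is a query (hypothetical helper)."""
    for unit in command.split(";"):
        # The header is the first whitespace-delimited token of each unit.
        parts = unit.strip().split(None, 1)
        if parts and parts[0].endswith("?"):
            return True
    return False


for cmd in ["*IDN?", "*RST", "MEAS:VOLT:DC? 10", "*CLS;SYST:ERR?"]:
    tool = "instrument_query" if expects_response(cmd) else "instrument_write"
    print(f"{cmd!r} -> {tool}")
```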
Bus hangs
When the GPIB bus stops responding to any commands:

- Assert Interface Clear

  > Send interface clear on bench-a

  The LLM calls interface_clear("bench-a"). This pulses the IFC line for 150 microseconds, forcing all devices to release the bus and making this bridge the Controller-In-Charge.

- Send a universal device clear

  > Send bus clear to all devices on bench-a

  The LLM calls bus_clear("bench-a") with no address. This sends DCL (Device Clear) to every device on the bus, which typically aborts any pending operation and clears input/output buffers.

- Re-scan to verify recovery

  > Scan bench-a

  If instruments reappear, the bus is recovered. If not, the problem is likely hardware (cable, connector, or a misbehaving instrument pulling bus lines low).
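The order of these steps matters: DCL is sent by the controller, so the bridge should be Controller-In-Charge before the device clear goes out. The sketch below shows the sequence with the three tools passed in as plain callables. The function `recover_bus`, its parameters, and the idea of comparing against a set of expected addresses are assumptions for illustration, not mcgpib's interface.

```python
from typing import Callable


def recover_bus(
    interface_clear: Callable[[], None],
    bus_clear: Callable[[], None],
    bus_scan: Callable[[], list[int]],
    expected: set[int],
) -> set[int]:
    """Run the recovery steps in order and return the addresses still missing.

    All callables are hypothetical stand-ins for the corresponding tools.
    """
    interface_clear()  # step 1: pulse IFC, retake Controller-In-Charge
    bus_clear()        # step 2: DCL to every device
    return expected - set(bus_scan())  # step 3: re-scan and compare


# Stand-ins that record the call order; address 22 stays missing after recovery.
log = []
missing = recover_bus(
    lambda: log.append("interface_clear"),
    lambda: log.append("bus_clear"),
    lambda: [1, 5],
    expected={1, 5, 22},
)
print(log, missing)
```

An empty result means every expected instrument reappeared; anything left over points at the hardware checks described in the last step.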
bus_clear vs interface_clear
These two tools address different layers of the problem:
| Tool | What it does | When to use |
|---|---|---|
| bus_clear (with address) | Sends Selected Device Clear (SDC) to one instrument | An instrument is stuck or has a full error queue |
| bus_clear (no address) | Sends Universal Device Clear (DCL) to all instruments | Multiple instruments are unresponsive |
| interface_clear | Pulses the IFC line; resets bus interfaces, not instrument state | Bus is completely hung, no commands get through |
flowchart LR
subgraph interface_clear["interface_clear (IFC)"]
direction TB
IFC_A["Resets handshake lines"]
IFC_B["Clears talker/listener<br/>assignments"]
IFC_C["Reasserts Controller-<br/>In-Charge"]
end
subgraph bus_clear["bus_clear (DCL / SDC)"]
direction TB
BC_A["Clears instrument<br/>input/output buffers"]
BC_B["Aborts pending<br/>operations"]
BC_C["Resets instrument<br/>parser state"]
end
IFC_SCOPE["Bus interface<br/>layer"]
BC_SCOPE["Instrument<br/>state"]
interface_clear -.-> IFC_SCOPE
bus_clear -.-> BC_SCOPE
style IFC_SCOPE fill:#334155,stroke:#94a3b8,color:#e2e8f0
style BC_SCOPE fill:#334155,stroke:#94a3b8,color:#e2e8f0
style interface_clear fill:#1e293b,stroke:#d97706,color:#fde68a
style bus_clear fill:#1e293b,stroke:#d97706,color:#fde68a
Use the troubleshoot_instrument prompt
For systematic diagnosis, use the built-in prompt:

> Use the troubleshoot_instrument prompt for address 5 on bench-a

This guides the LLM through a six-step diagnostic sequence:

- Verify bus connectivity with a scan
- Check instrument identity with *IDN?
- Read status byte with serial poll
- Clear error state with *CLS and SYST:ERR?
- Reset the instrument with *RST
- Bus-level recovery with interface_clear and bus_clear
The prompt exits at the first step that resolves the problem, so minor issues are fixed quickly.
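The exit-at-first-success behaviour amounts to an escalation ladder. The sketch below illustrates that control flow only; it is not how the prompt is implemented. `run_until_resolved` and the step callables are hypothetical, with each callable returning True when its step resolved the problem.

```python
from typing import Callable, Optional


def run_until_resolved(steps: list[tuple[str, Callable[[], bool]]]) -> Optional[str]:
    """Run steps in order; return the name of the first one that resolves the problem."""
    for name, attempt in steps:
        if attempt():
            return name
    return None  # every step ran and none resolved it


# Simulated session: the problem clears once the error state is cleared.
ran = []


def step(name: str, resolves: bool) -> tuple[str, Callable[[], bool]]:
    return name, lambda: ran.append(name) or resolves


result = run_until_resolved([
    step("scan", False),
    step("identify", False),
    step("serial_poll", False),
    step("clear_errors", True),
    step("reset", True),
    step("bus_recovery", True),
])
print(result, ran)
```

Because the loop returns early, the more disruptive steps (reset and bus-level recovery) never run when a lighter step is enough.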
Hardware-level diagnostics
For suspected cable or connector problems, use bus_diagnostic to write known bit patterns directly to the bus lines:

> Run bus diagnostic on bench-a, write 0xAA to the data bus

The LLM calls bus_diagnostic("bench-a", mode=0, value=0xAA):

    Diagnostic: writing 0xAA to data bus on bench-a for 10 seconds

This holds the pattern on the bus for 10 seconds, giving you time to probe individual lines with an oscilloscope or logic analyzer. Mode 0 writes to the data bus (DIO1-DIO8); mode 1 writes to the control bus (ATN, IFC, REN, SRQ, EOI).
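Before probing, it helps to know which data lines a pattern should drive. The sketch below maps an 8-bit value to DIO line names, using the IEEE 488 convention that DIO1 carries the least significant bit. The helper `asserted_data_lines` is made up for illustration. Whether a 1 bit shows up as a low or a high voltage at the connector is not covered here: GPIB uses negative-true logic, and how the firmware maps diagnostic bits to pin levels is an assumption to confirm on the scope.

```python
def asserted_data_lines(value: int) -> list[str]:
    """List the DIO lines whose bit is 1 in an 8-bit pattern (DIO1 = bit 0)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("pattern must fit in 8 bits")
    return [f"DIO{bit + 1}" for bit in range(8) if value & (1 << bit)]


print(asserted_data_lines(0xAA))  # ['DIO2', 'DIO4', 'DIO6', 'DIO8']
print(asserted_data_lines(0x55))  # ['DIO1', 'DIO3', 'DIO5', 'DIO7']
```

Running the diagnostic with 0xAA and then 0x55 puts every data line in both states, which can help expose a stuck line or two adjacent lines shorted together.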
ESP32 brownout issues
The ESP32’s brownout detector is disabled in the AR488 firmware because GPIB bus activity causes transient voltage dips. However, if the ESP32 is powered from a USB port with marginal current capacity, you may see:
- Bridge disconnects mid-sequence
- Corrupted responses (partial or garbled data)
- The bridge stops responding and requires a power cycle
Mitigations:
- Use a powered USB hub or a USB port with reliable 500mA+ capacity
- Increase inter_command_delay_ms to 20-50 ms to reduce peak current draw
- For WiFi bridges, use a dedicated 5V power supply rather than USB power
- Keep GPIB cable runs short (under 4 meters per segment, under 20 meters total)