
better error message for kernels indirectly calling kernels through RPCs #654

Closed
vontell opened this issue Jan 12, 2017 · 8 comments

@vontell

vontell commented Jan 12, 2017

I am currently having an issue with my experiment, which detects rising edges and outputs a pulse after a certain number of rising edges.

I get the following error (the full stack trace is at the bottom of this issue):
OSError: Incorrect reply from device: _D2HMsgType.LOAD_FAILED (expected _D2HMsgType.LOAD_COMPLETED)

The experiment is as follows:

from artiq.experiment import *
from rle.pipistrello import Board

class PipistrelloTest(EnvExperiment):

    def build(self):
        # Initialize the board
        self.board = Board(self)

    @kernel
    def run(self):
        self.board.reset()

        # flash the board to confirm connection
        self.board.led_test()

        # Find the latency of this board
        # (Minimum delay needed between instructions)
        # TODO Why does 64 not work? Seems to be a buffer issue...
        print("Finding board latency...")
        latency = self.board.find_latency(4 * us, 63, 50, 2)
        print("Found latency: ", latency, " sec")

        # Start the pulse
        self.board.get_core().break_realtime()

        # Delaying to make sure we don't get initial RTIO Underflow Errors
        print("Before delay of 2 seconds: ", now_mu())
        delay(2*s)
        START = now_mu()
        print("Start time: ", START)
        print("Starting pulse")
        # pulse ttl = 0 with a square wave of T = 4 us, and 20 oscillations
        self.board.pulse(0, 4 * us, 20)
        print("Pulses placed. Done!")

        # Start listening for rising edge events
        print("Register rising edge event")

        # next_pulse will be given self.board and the timestamp
        # of the last rising edge. An additional delay of `latency`
        # will also be inserted before calling next_pulse
        self.board.register_rising(0, next_pulse, START, threshold=5)

@kernel
def next_pulse(board, start):

    print("Starting new pulse")

    at_mu(start)
    delay(1*s)

    # THIS IS WHERE THE ERROR OCCURS
    board.pulseDC(1, 1*s)

    print("Finished new pulse placement")

The methods of self.board can be found here. In terms of the relevant methods, self.board.pulse() outputs a square wave on a given TTL channel with the given period and number of oscillations. self.board.register_rising() listens for rising edges on a given PMT input until the threshold is reached, upon which next_pulse is executed after a delay of latency, as calculated by the find_latency() method.

Overall, the expected behavior of this code is to output 20 rising edges on TTL 0, 4 microseconds apart; listen for those pulses on PMT 0; and after seeing 5 rising edges, take the timestamp of rising edge 6, delay by the found latency, call next_pulse, delay by another second, and pulse TTL 1 for 1 second. However, I get the following output and stack trace, failing at board.pulseDC(1, 1*s), which is equivalent to ttl1.pulse(1*s):

Finding board latency...
Found latency:  8.000000000002674e-09  sec
Before delay of 2 seconds:  1199160104175
Start time:  1201160104174
Starting pulse
Pulse starts at  1201160104174
Pulse end at  1201160184134
Pulses placed. Done!
Register rising edge event
Rising edge at  1201160104222
Rising edge at  1201160108220
Rising edge at  1201160112218
Rising edge at  1201160116216
Rising edge at  1201160120214
Rising edge at  1201160124212
Start handler
Starting new pulse
Core Device Traceback (most recent call last):
  File "repository/fast-test.py", line 49, in artiq_run_fast-test.PipistrelloTest.run(..., ...) (RA=+0x19bc)
    self.board.register_rising(0, next_pulse, START, threshold=5)
  File "/home/aaronv/Documents/artiq-control/samples/repository/rle/pipistrello.py", line 205, in ... rle.pipistrello.Board.register_rising<rle.pipistrello.Board>(...) (RA=+0x1f00)
    handler(self, now_mu())
  File "repository/fast-test.py", line 60, in __modinit__ (RA=+0x2080)
    board.pulseDC(1, 1*s)
  File "/home/aaronv/Documents/artiq-control/samples/repository/rle/pipistrello.py", line 173, in pulseDC
    self.ttls[ttl].pulse(length)
builtins.OSError(34): Incorrect reply from device: _D2HMsgType.LOAD_FAILED (expected _D2HMsgType.LOAD_COMPLETED)
Traceback (most recent call last):
  File "/home/aaronv/miniconda3/envs/artiq-main/bin/artiq_run", line 11, in <module>
    load_entry_point('artiq==2.1', 'console_scripts', 'artiq_run')()
  File "/home/aaronv/miniconda3/envs/artiq-main/lib/python3.5/site-packages/artiq/frontend/artiq_run.py", line 213, in main
    return run(with_file=True)
  File "/home/aaronv/miniconda3/envs/artiq-main/lib/python3.5/site-packages/artiq/frontend/artiq_run.py", line 199, in run
    raise exn
  File "/home/aaronv/miniconda3/envs/artiq-main/lib/python3.5/site-packages/artiq/frontend/artiq_run.py", line 192, in run
    exp_inst.run()
  File "/home/aaronv/miniconda3/envs/artiq-main/lib/python3.5/site-packages/artiq/language/core.py", line 54, in run_on_core
    return getattr(self, arg).run(run_on_core, ((self,) + k_args), k_kwargs)
  File "/home/aaronv/miniconda3/envs/artiq-main/lib/python3.5/site-packages/artiq/coredevice/core.py", line 122, in run
    self.comm.serve(embedding_map, symbolizer, demangler)
  File "/home/aaronv/miniconda3/envs/artiq-main/lib/python3.5/site-packages/artiq/coredevice/comm_generic.py", line 539, in serve
    self._serve_exception(embedding_map, symbolizer, demangler)
  File "/home/aaronv/miniconda3/envs/artiq-main/lib/python3.5/site-packages/artiq/coredevice/comm_generic.py", line 531, in _serve_exception
    raise python_exn
OSError: Incorrect reply from device: _D2HMsgType.LOAD_FAILED (expected _D2HMsgType.LOAD_COMPLETED)

Any advice or tips would be greatly appreciated!

@whitequark
Contributor

What's the output of artiq_corelog?

@vontell
Author

vontell commented Jan 12, 2017

artiq_corelog outputs:

Startup RTIO clock: internal
Attempted to load new kernel library while already running
Attempted to load new kernel library while already running
Attempted to load new kernel library while already running
Attempted to load new kernel library while already running


@whitequark
Contributor

Ah, right. So what you're trying to do is:

  • from host Python code, call kernel Python code (this works, via a compilation);
  • from kernel Python code, call host Python code №2 (this works, via an RPC);
  • from host Python code №2, call kernel Python code again (this is not supported).

You should add a @kernel annotation here: https://github.com/vontell/artiq-control/blob/master/samples/repository/rle/pipistrello.py#L169
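
For illustration, here is a minimal sketch of that pattern (the device name ttl1 and the method names are assumptions for this sketch, not taken from the linked repository). Because the handler itself carries @kernel, the call stays inside the running kernel instead of going back out to the host and then trying to load a second kernel:

from artiq.experiment import *

class HandlerDemo(EnvExperiment):
    def build(self):
        self.setattr_device("core")
        self.setattr_device("ttl1")   # assumed device name

    @kernel
    def handler(self):
        # Marked @kernel, so the call from run() below is compiled into
        # the same kernel rather than executed on the host as an RPC.
        delay(1*s)
        self.ttl1.pulse(1*s)

    @kernel
    def run(self):
        self.core.break_realtime()
        self.handler()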


@vontell
Author

vontell commented Jan 12, 2017

Ah of course; thank you!


@vontell vontell closed this as completed Jan 12, 2017
@jordens
Member

jordens commented Jan 13, 2017

@whitequark Could we catch this kind of invalid indirect kernel call through RPCs early, so that Aaron would have known what the problem is?

@jordens jordens reopened this Jan 13, 2017
@whitequark
Contributor

@jordens Not sure about "early" but we could definitely return a cause with the LOAD_FAILED message, which would address this as well.

@jordens
Member

jordens commented Jan 24, 2017

It seems to me that we could just mark the coredevice object on the host as "in-use", so that when an RPC comes in from a kernel, the worker handling the RPC can tell straight away that calling another kernel (through this reentrant code path) while the coredevice is active is disallowed.
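
Purely as a sketch of that idea (the class and attribute names below are invented for illustration and do not exist in ARTIQ):

class CoreDeviceGuard:
    """Hypothetical "in-use" marker for the host-side coredevice object."""

    def __init__(self):
        self._kernel_running = False

    def run_kernel(self, kernel_fn, *args):
        if self._kernel_running:
            # An RPC handler tried to start a kernel while one is already
            # executing on the core device: fail with a clear message
            # instead of the opaque LOAD_FAILED reply from the device.
            raise RuntimeError(
                "kernel called indirectly through an RPC while another "
                "kernel is already running on the core device")
        self._kernel_running = True
        try:
            return kernel_fn(*args)
        finally:
            self._kernel_running = False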

@jordens jordens changed the title LOAD_FAILED message on TTL pulse better error message for kernels indirectly calling kernels through RPCs Jan 24, 2017
@whitequark
Contributor

It seems better to do this in the RPC protocol, because it catches more errors and simplifies bug reporting for any cause of load failure.
