
better error message for kernels indirectly calling kernels through RPCs #654

Closed
vontell opened this issue Jan 12, 2017 · 8 comments

@vontell

vontell commented Jan 12, 2017

I am currently having an issue with my experiment, which detects rising edges and outputs a pulse after a certain number of rising edges.

I get the following error (the full stack trace is at the bottom of this issue):
OSError: Incorrect reply from device: _D2HMsgType.LOAD_FAILED (expected _D2HMsgType.LOAD_COMPLETED)

The experiment is as follows:

from artiq.experiment import *
from rle.pipistrello import Board

class PipistrelloTest(EnvExperiment):

    def build(self):
        # Initialize the board
        self.board = Board(self)

    @kernel
    def run(self):
        self.board.reset()

        # flash the board to confirm connection
        self.board.led_test()

        # Find the latency of this board
        # (Minimum delay needed between instructions)
        # TODO Why does 64 not work? Seems to be a buffer issue...
        print("Finding board latency...")
        latency = self.board.find_latency(4 * us, 63, 50, 2)
        print("Found latency: ", latency, " sec")

        # Start the pulse
        self.board.get_core().break_realtime()

        # Delaying to make sure we don't get initial RTIO Underflow Errors
        print("Before delay of 2 seconds: ", now_mu())
        delay(2*s)
        START = now_mu()
        print("Start time: ", START)
        print("Starting pulse")
        # pulse ttl = 0 with a square wave of T = 4 us, and 20 oscillations
        self.board.pulse(0, 4 * us, 20)
        print("Pulses placed. Done!")

        # Start listening for rising edge events
        print("Register rising edge event")

        # next_pulse will be given self.board and the timestamp
        # of the last rising edge. An additional delay of `latency`
        # will also be inserted before calling next_pulse
        self.board.register_rising(0, next_pulse, START, threshold=5)

@kernel
def next_pulse(board, start):

    print("Starting new pulse")

    at_mu(start)
    delay(1*s)

    # THIS IS WHERE THE ERROR OCCURS
    board.pulseDC(1, 1*s)

    print("Finished new pulse placement")

The methods of self.board can be found here. In terms of the relevant methods, self.board.pulse() outputs a square wave on a given TTL channel with the given period and number of oscillations. self.board.register_rising() listens for rising edges on a given PMT input until the threshold is reached, upon which next_pulse is executed after a delay of latency, as calculated by the find_latency() method.

Overall, the expected behavior of this code is to output 20 rising edges on TTL 0, 4 microseconds apart; listen for those pulses on PMT 0; and after seeing 5 rising edges, take the timestamp of rising edge 6, delay by the found latency, call next_pulse, delay by another second, and pulse TTL 1 for 1 second. However, I get the following output and stack trace, failing at board.pulseDC(1, 1*s), which is equivalent to ttl1.pulse(1*s):

Finding board latency...
Found latency:  8.000000000002674e-09  sec
Before delay of 2 seconds:  1199160104175
Start time:  1201160104174
Starting pulse
Pulse starts at  1201160104174
Pulse end at  1201160184134
Pulses placed. Done!
Register rising edge event
Rising edge at  1201160104222
Rising edge at  1201160108220
Rising edge at  1201160112218
Rising edge at  1201160116216
Rising edge at  1201160120214
Rising edge at  1201160124212
Start handler
Starting new pulse
Core Device Traceback (most recent call last):
  File "repository/fast-test.py", line 49, in artiq_run_fast-test.PipistrelloTest.run(..., ...) (RA=+0x19bc)
    self.board.register_rising(0, next_pulse, START, threshold=5)
  File "/home/aaronv/Documents/artiq-control/samples/repository/rle/pipistrello.py", line 205, in ... rle.pipistrello.Board.register_rising<rle.pipistrello.Board>(...) (RA=+0x1f00)
    handler(self, now_mu())
  File "repository/fast-test.py", line 60, in __modinit__ (RA=+0x2080)
    board.pulseDC(1, 1*s)
  File "/home/aaronv/Documents/artiq-control/samples/repository/rle/pipistrello.py", line 173, in pulseDC
    self.ttls[ttl].pulse(length)
builtins.OSError(34): Incorrect reply from device: _D2HMsgType.LOAD_FAILED (expected _D2HMsgType.LOAD_COMPLETED)
Traceback (most recent call last):
  File "/home/aaronv/miniconda3/envs/artiq-main/bin/artiq_run", line 11, in <module>
    load_entry_point('artiq==2.1', 'console_scripts', 'artiq_run')()
  File "/home/aaronv/miniconda3/envs/artiq-main/lib/python3.5/site-packages/artiq/frontend/artiq_run.py", line 213, in main
    return run(with_file=True)
  File "/home/aaronv/miniconda3/envs/artiq-main/lib/python3.5/site-packages/artiq/frontend/artiq_run.py", line 199, in run
    raise exn
  File "/home/aaronv/miniconda3/envs/artiq-main/lib/python3.5/site-packages/artiq/frontend/artiq_run.py", line 192, in run
    exp_inst.run()
  File "/home/aaronv/miniconda3/envs/artiq-main/lib/python3.5/site-packages/artiq/language/core.py", line 54, in run_on_core
    return getattr(self, arg).run(run_on_core, ((self,) + k_args), k_kwargs)
  File "/home/aaronv/miniconda3/envs/artiq-main/lib/python3.5/site-packages/artiq/coredevice/core.py", line 122, in run
    self.comm.serve(embedding_map, symbolizer, demangler)
  File "/home/aaronv/miniconda3/envs/artiq-main/lib/python3.5/site-packages/artiq/coredevice/comm_generic.py", line 539, in serve
    self._serve_exception(embedding_map, symbolizer, demangler)
  File "/home/aaronv/miniconda3/envs/artiq-main/lib/python3.5/site-packages/artiq/coredevice/comm_generic.py", line 531, in _serve_exception
    raise python_exn
OSError: Incorrect reply from device: _D2HMsgType.LOAD_FAILED (expected _D2HMsgType.LOAD_COMPLETED)

Any advice or tips would be greatly appreciated!

@whitequark
Contributor

What's the output of artiq_corelog?

@vontell
Author

vontell commented Jan 12, 2017

artiq_corelog outputs:

Startup RTIO clock: internal
Attempted to load new kernel library while already running
Attempted to load new kernel library while already running
Attempted to load new kernel library while already running
Attempted to load new kernel library while already running


@whitequark
Contributor

Ah, right. So what you're trying to do is:

  • from host Python code, call kernel Python code (this works, via a compilation);
  • from kernel Python code, call host Python code №2 (this works, via an RPC);
  • from host Python code №2, call kernel Python code again (this is not supported).

You should add a @kernel annotation here: https://github.com/vontell/artiq-control/blob/master/samples/repository/rle/pipistrello.py#L169
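
For illustration, here is a minimal sketch of that pattern (the device name ttl1 and the method names are assumptions for this sketch, not taken from the linked repository). Because the handler itself carries @kernel, the call stays inside the running kernel instead of going back out to the host and then trying to load a second kernel:

from artiq.experiment import *

class HandlerDemo(EnvExperiment):
    def build(self):
        self.setattr_device("core")
        self.setattr_device("ttl1")   # assumed device name

    @kernel
    def handler(self):
        # Marked @kernel, so the call from run() below is compiled into
        # the same kernel rather than executed on the host as an RPC.
        delay(1*s)
        self.ttl1.pulse(1*s)

    @kernel
    def run(self):
        self.core.break_realtime()
        self.handler()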


@vontell
Author

vontell commented Jan 12, 2017

Ah of course; thank you!


@vontell vontell closed this as completed Jan 12, 2017
@jordens
Member

jordens commented Jan 13, 2017

@whitequark Could we catch this kind of invalid indirect kernel call through RPCs early, so that Aaron would have known what the problem is?

@jordens jordens reopened this Jan 13, 2017
@whitequark
Contributor

@jordens Not sure about "early" but we could definitely return a cause with the LOAD_FAILED message, which would address this as well.

@jordens
Member

jordens commented Jan 24, 2017

It seems to me that we could just mark the coredevice object on the host as "in-use", so that when an RPC comes in from a kernel, the worker handling the RPC can tell straight away that calling another kernel (through this reentrant code path) while the coredevice is active is disallowed.
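
Purely as a sketch of that idea (the class and attribute names below are invented for illustration and do not exist in ARTIQ):

class CoreDeviceGuard:
    """Hypothetical "in-use" marker for the host-side coredevice object."""

    def __init__(self):
        self._kernel_running = False

    def run_kernel(self, kernel_fn, *args):
        if self._kernel_running:
            # An RPC handler tried to start a kernel while one is already
            # executing on the core device: fail with a clear message
            # instead of the opaque LOAD_FAILED reply from the device.
            raise RuntimeError(
                "kernel called indirectly through an RPC while another "
                "kernel is already running on the core device")
        self._kernel_running = True
        try:
            return kernel_fn(*args)
        finally:
            self._kernel_running = False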

@jordens jordens changed the title LOAD_FAILED message on TTL pulse better error message for kernels indirectly calling kernels through RPCs Jan 24, 2017
@whitequark
Contributor

It seems better to do this in the RPC protocol, because it catches more errors and simplifies bug reporting for any cause of load failure.
