[<Ray component: Core] Fatal Python error: Segmentation fault #33285

Closed
502122559 opened this issue Mar 14, 2023 · 4 comments
Labels
bug: Something that is supposed to be working; but isn't
core: Issues that should be addressed in Ray Core
@external-author-action-required: Alternate tag for PRs where the author doesn't have labeling permission.

Comments

502122559 commented Mar 14, 2023

What happened + What you expected to happen

I got the following error while using Ray. It seems to be caused by the poll strategy of grpcio. Can you give me some help? The problem is intermittent, so I can't provide code that reproduces it.

Versions / Dependencies

ray==2.3.0
grpcio==1.32.0
python==3.8.13

Reproduction script

*** SIGSEGV received at time=1678337610 on cpu 12 ***
UserWarning: ks_thres is not input, has set the default value 0.01.
*** SIGSEGV received at time=1678337610 on cpu 10 ***
PC: @ 0x7fe04ff8d46a (unknown) pollset_work()
@ 0x7fe053202090 (unknown) (unknown)
[2023-03-09 12:53:30,446 E 957 30467] logging.cc:361: *** SIGSEGV received at time=1678337610 on cpu 12 ***
[2023-03-09 12:53:30,446 E 957 30467] logging.cc:361: PC: @ 0x7fe04ff8d46a (unknown) pollset_work()
[2023-03-09 12:53:30,446 E 957 30467] logging.cc:361: @ 0x7fe053202090 (unknown) (unknown)
Fatal Python error: Segmentation fault

Stack (most recent call first):
File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 813 in _blocking
File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 824 in __call__
File "/usr/local/lib/python3.8/site-packages/ray/_private/gcs_utils.py", line 352 in internal_kv_exists
File "/usr/local/lib/python3.8/site-packages/ray/_private/gcs_utils.py", line 198 in wrapper
File "/usr/local/lib/python3.8/site-packages/ray/_private/function_manager.py", line 213 in export
File "/usr/local/lib/python3.8/site-packages/ray/remote_function.py", line 282 in _remote
File "/usr/local/lib/python3.8/site-packages/ray/util/tracing/tracing_helper.py", line 307 in _invocation_remote_span
File "/usr/local/lib/python3.8/site-packages/ray/remote_function.py", line 129 in _remote_proxy
File "flex/crypto/paillier/encryptor.py", line 156 in
File "flex/crypto/paillier/encryptor.py", line 156 in encrypt_ray
File "flex/crypto/paillier/encryptor.py", line 120 in _encrypt_numpy
File "flex/crypto/paillier/encryptor.py", line 140 in encrypt
File "flex/cores/distributed_ray.py", line 104 in pailler_encrypt_ray
File "/usr/local/lib/python3.8/multiprocessing/process.py", line 108 in run
File "/usr/local/lib/python3.8/multiprocessing/process.py", line 315 in _bootstrap
File "/usr/local/lib/python3.8/multiprocessing/popen_fork.py", line 75 in _launch
File "/usr/local/lib/python3.8/multiprocessing/popen_fork.py", line 19 in __init__
File "/usr/local/lib/python3.8/multiprocessing/context.py", line 277 in _Popen
File "/usr/local/lib/python3.8/multiprocessing/context.py", line 224 in _Popen
File "/usr/local/lib/python3.8/multiprocessing/process.py", line 121 in start
File "/usr/local/lib/python3.8/site-packages/safe/scheduler/scheduler.py", line 589 in train_task
File "/usr/local/lib/python3.8/site-packages/safe/scheduler/scheduler.py", line 463 in submit_task
File "/usr/local/lib/python3.8/site-packages/ray/_private/worker.py", line 772 in main_loop
File "/usr/local/lib/python3.8/site-packages/ray/_private/workers/default_worker.py", line 226 in
PC: @ 0x7fe04ff8d46a (unknown) pollset_work()
@ 0x7fe053202090 (unknown) (unknown)
[2023-03-09 12:53:30,449 E 958 30467] logging.cc:361: *** SIGSEGV received at time=1678337610 on cpu 10 ***
[2023-03-09 12:53:30,450 E 958 30467] logging.cc:361: PC: @ 0x7fe04ff8d46a (unknown) pollset_work()
[2023-03-09 12:53:30,450 E 958 30467] logging.cc:361: @ 0x7fe053202090 (unknown) (unknown)
Fatal Python error: Segmentation fault

Stack (most recent call first):
File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 813 in _blocking
File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 824 in __call__
File "/usr/local/lib/python3.8/site-packages/ray/_private/gcs_utils.py", line 352 in internal_kv_exists
File "/usr/local/lib/python3.8/site-packages/ray/_private/gcs_utils.py", line 198 in wrapper
File "/usr/local/lib/python3.8/site-packages/ray/_private/function_manager.py", line 213 in export
File "/usr/local/lib/python3.8/site-packages/ray/remote_function.py", line 282 in _remote
File "/usr/local/lib/python3.8/site-packages/ray/util/tracing/tracing_helper.py", line 307 in _invocation_remote_span
File "/usr/local/lib/python3.8/site-packages/ray/remote_function.py", line 129 in _remote_proxy
File "flex/crypto/paillier/encryptor.py", line 156 in
File "flex/crypto/paillier/encryptor.py", line 156 in encrypt_ray
File "flex/crypto/paillier/encryptor.py", line 120 in _encrypt_numpy
File "flex/crypto/paillier/encryptor.py", line 140 in encrypt
File "flex/cores/distributed_ray.py", line 104 in pailler_encrypt_ray
File "/usr/local/lib/python3.8/multiprocessing/process.py", line 108 in run
File "/usr/local/lib/python3.8/multiprocessing/process.py", line 315 in _bootstrap
File "/usr/local/lib/python3.8/multiprocessing/popen_fork.py", line 75 in _launch
File "/usr/local/lib/python3.8/multiprocessing/popen_fork.py", line 19 in __init__
File "/usr/local/lib/python3.8/multiprocessing/context.py", line 277 in _Popen
File "/usr/local/lib/python3.8/multiprocessing/context.py", line 224 in _Popen
File "/usr/local/lib/python3.8/multiprocessing/process.py", line 121 in start
File "/usr/local/lib/python3.8/site-packages/safe/scheduler/scheduler.py", line 589 in train_task
File "/usr/local/lib/python3.8/site-packages/safe/scheduler/scheduler.py", line 463 in submit_task
File "/usr/local/lib/python3.8/site-packages/ray/_private/worker.py", line 772 in main_loop
File "/usr/local/lib/python3.8/site-packages/ray/_private/workers/default_worker.py", line 226 in
UserWarning: ks_thres is not input, has set the default value 0.01.
*** SIGSEGV received at time=1678337610 on cpu 30 ***
PC: @ 0x7fe04ff8d46a (unknown) pollset_work()
@ 0x7fe053202090 (unknown) (unknown)
[2023-03-09 12:53:30,471 E 960 30467] logging.cc:361: *** SIGSEGV received at time=1678337610 on cpu 30 ***
[2023-03-09 12:53:30,471 E 960 30467] logging.cc:361: PC: @ 0x7fe04ff8d46a (unknown) pollset_work()
[2023-03-09 12:53:30,471 E 960 30467] logging.cc:361: @ 0x7fe053202090 (unknown) (unknown)
Fatal Python error: Segmentation fault

Stack (most recent call first):
File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 813 in _blocking
File "/usr/local/lib/python3.8/site-packages/grpc/_channel.py", line 824 in __call__
File "/usr/local/lib/python3.8/site-packages/ray/_private/gcs_utils.py", line 352 in internal_kv_exists
File "/usr/local/lib/python3.8/site-packages/ray/_private/gcs_utils.py", line 198 in wrapper
File "/usr/local/lib/python3.8/site-packages/ray/_private/function_manager.py", line 213 in export
File "/usr/local/lib/python3.8/site-packages/ray/remote_function.py", line 282 in _remote
File "/usr/local/lib/python3.8/site-packages/ray/util/tracing/tracing_helper.py", line 307 in _invocation_remote_span
File "/usr/local/lib/python3.8/site-packages/ray/remote_function.py", line 129 in _remote_proxy
File "flex/crypto/paillier/encryptor.py", line 156 in
File "flex/crypto/paillier/encryptor.py", line 156 in encrypt_ray
File "flex/crypto/paillier/encryptor.py", line 120 in _encrypt_numpy
File "flex/crypto/paillier/encryptor.py", line 140 in encrypt
File "flex/cores/distributed_ray.py", line 104 in pailler_encrypt_ray
File "/usr/local/lib/python3.8/multiprocessing/process.py", line 108 in run
File "/usr/local/lib/python3.8/multiprocessing/process.py", line 315 in _bootstrap
File "/usr/local/lib/python3.8/multiprocessing/popen_fork.py", line 75 in _launch
File "/usr/local/lib/python3.8/multiprocessing/popen_fork.py", line 19 in __init__
File "/usr/local/lib/python3.8/multiprocessing/context.py", line 277 in _Popen
File "/usr/local/lib/python3.8/multiprocessing/context.py", line 224 in _Popen
File "/usr/local/lib/python3.8/multiprocessing/process.py", line 121 in start
File "/usr/local/lib/python3.8/site-packages/safe/scheduler/scheduler.py", line 589 in train_task
File "/usr/local/lib/python3.8/site-packages/safe/scheduler/scheduler.py", line 463 in submit_task
File "/usr/local/lib/python3.8/site-packages/ray/_private/worker.py", line 772 in main_loop
File "/usr/local/lib/python3.8/site-packages/ray/_private/workers/default_worker.py", line 226 in
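
The prefixed crash lines in the log above share a fixed layout, so the affected workers can be extracted mechanically. The following is a minimal sketch, assuming the two integers after the log level are the process ID and thread ID; the name `parse_sigsegv_lines` is hypothetical and not part of Ray.

```python
import re
from datetime import datetime, timezone

# Matches lines such as:
# [2023-03-09 12:53:30,446 E 957 30467] logging.cc:361: *** SIGSEGV received at time=1678337610 on cpu 12 ***
# Assumption: the two integers after the level letter are the process ID and thread ID.
SIGSEGV_RE = re.compile(
    r"\[(?P<stamp>[\d-]+ [\d:,]+) (?P<level>\w) (?P<pid>\d+) (?P<tid>\d+)\] "
    r".*SIGSEGV received at time=(?P<epoch>\d+) on cpu (?P<cpu>\d+)"
)


def parse_sigsegv_lines(log_text):
    """Return one dict per prefixed SIGSEGV line found in log_text."""
    crashes = []
    for line in log_text.splitlines():
        match = SIGSEGV_RE.search(line)
        if match is None:
            continue  # unprefixed duplicates and stack frames are skipped
        crashes.append({
            "pid": int(match["pid"]),
            "tid": int(match["tid"]),
            "cpu": int(match["cpu"]),
            # The time= field is a Unix timestamp; convert it to an aware UTC datetime.
            "utc": datetime.fromtimestamp(int(match["epoch"]), tz=timezone.utc),
        })
    return crashes


# Three lines copied from the log in this issue.
sample = """\
[2023-03-09 12:53:30,446 E 957 30467] logging.cc:361: *** SIGSEGV received at time=1678337610 on cpu 12 ***
[2023-03-09 12:53:30,449 E 958 30467] logging.cc:361: *** SIGSEGV received at time=1678337610 on cpu 10 ***
[2023-03-09 12:53:30,471 E 960 30467] logging.cc:361: *** SIGSEGV received at time=1678337610 on cpu 30 ***
"""

for crash in parse_sigsegv_lines(sample):
    print(crash["pid"], crash["cpu"], crash["utc"].isoformat())
```

On this log the sketch reports three distinct process IDs (957, 958, 960) crashing within the same second, which is consistent with several child processes hitting the same fault in `pollset_work()`.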

Issue Severity

Medium: It is a significant difficulty but I can work around it.

502122559 added the bug and triage labels Mar 14, 2023
502122559 changed the title from "[<Ray component: Core]" to "[<Ray component: Core] Fatal Python error: Segmentation fault" Mar 14, 2023
rkooo567 added the core and triage labels and removed the triage label Mar 14, 2023
@rkooo567
Contributor

What's your platform (macOS or Linux)?

@rkooo567
Contributor

Do you mind trying grpc 1.49.1 to see if this still happens?

scv119 added the needs-repro-script and @external-author-action-required labels and removed the needs-repro-script and triage labels Mar 15, 2023
@scv119
Contributor

scv119 commented Mar 16, 2023

Most likely grpc/grpc#23796.
As @rkooo567 suggested, let us know if upgrading grpc solves the problem.
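
Before upgrading, it can help to confirm which grpcio version a given environment actually loads. The sketch below uses the standard-library `importlib.metadata` and compares against the 1.49.1 suggested in this thread; the helper names (`version_tuple`, `is_older`, `installed_grpcio_version`) are hypothetical, and the comparison only looks at the leading digits of each dotted part.

```python
import re
from importlib import metadata

SUGGESTED = "1.49.1"  # version suggested earlier in this thread


def version_tuple(version):
    """Turn '1.32.0' into (1, 32, 0), using the leading digits of each dotted part."""
    numbers = []
    for part in version.split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break  # stop at the first part that does not start with a digit
        numbers.append(int(match.group()))
    return tuple(numbers)


def is_older(installed, suggested=SUGGESTED):
    """Return True if the installed version sorts before the suggested one."""
    return version_tuple(installed) < version_tuple(suggested)


def installed_grpcio_version():
    """Return the installed grpcio version string, or None if it is not installed."""
    try:
        return metadata.version("grpcio")
    except metadata.PackageNotFoundError:
        return None


current = installed_grpcio_version()
if current is None:
    print("grpcio is not installed in this environment")
elif is_older(current):
    print(f"grpcio {current} is older than {SUGGESTED}")
else:
    print(f"grpcio {current} is at least {SUGGESTED}")
```

Running this inside the same environment as the Ray workers matters, since the driver and the workers may not share an installation.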

@502122559
Author

502122559 commented Mar 27, 2023

@rkooo567 @scv119 I did not upgrade the version. Instead, I set three environment variables, and the exception has not recurred so far.
ray.init(
    address=ray_address,
    runtime_env={
        'env_vars': {
            'GRPC_ENABLE_FORK_SUPPORT': 'True',
            'GRPC_POLL_STRATEGY': 'epoll1',
            'RAY_start_python_importer_thread': '0',
        }
    },
)
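
If the job already passes its own `runtime_env`, the three variables can be merged in rather than replacing it. This is only a sketch of that idea: `with_env_vars`, `GRPC_FORK_WORKAROUND`, and `MY_FLAG` are hypothetical names, the values are copied from the comment above, and the block deliberately does not import Ray so it runs on its own.

```python
# Values copied from the comment above; environment variable values must be strings.
GRPC_FORK_WORKAROUND = {
    "GRPC_ENABLE_FORK_SUPPORT": "True",
    "GRPC_POLL_STRATEGY": "epoll1",
    "RAY_start_python_importer_thread": "0",
}


def with_env_vars(runtime_env=None, defaults=GRPC_FORK_WORKAROUND):
    """Return a copy of runtime_env whose 'env_vars' include the defaults.

    Entries already present in runtime_env['env_vars'] win over the defaults,
    so a caller can still override any single variable.
    """
    merged = dict(runtime_env or {})
    env_vars = dict(defaults)
    env_vars.update(merged.get("env_vars", {}))
    merged["env_vars"] = env_vars
    return merged


# MY_FLAG is a hypothetical variable standing in for settings the job already uses.
runtime_env = with_env_vars({"env_vars": {"MY_FLAG": "1"}})
print(runtime_env)

# Assuming Ray is installed and ray_address is defined as in the comment above:
# ray.init(address=ray_address, runtime_env=runtime_env)
```

The input dictionary is copied rather than mutated, so the same base `runtime_env` can be reused for jobs that should not receive the workaround.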

3 participants