After restarting the meta node, the StreamingClient
is unable to connect. The node restarts at line 495 of the log (see the tmp.log
file below).
This node currently is a leader. Serving at 192.168.1.1:5690
The node is still serving on the same address as before. Retrieving a StreamClient
results in "Transport error: transport error".
The error is caused by the code below in observer_manager.rs:
/// `re_subscribe` is used to re-subscribe to the meta's notification.
async fn re_subscribe(&mut self)
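For illustration only: a minimal, generic sketch of retrying a connection attempt with exponential backoff after a transport failure. This is not the body of `re_subscribe` and not how RisingWave handles it. `TransportError` and `retry_with_backoff` are hypothetical names, and the sketch is synchronous and std-only so that it stays self-contained.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical stand-in for a transport-level failure (not a RisingWave type).
#[derive(Debug, PartialEq)]
pub struct TransportError(pub String);

/// Calls `connect` up to `max_attempts` times, passing the 1-based attempt
/// number. After each failed attempt except the last, sleeps for the current
/// delay and then doubles it. Returns the first success or the last error.
pub fn retry_with_backoff<T, F>(
    mut connect: F,
    max_attempts: u32,
    initial_delay: Duration,
) -> Result<T, TransportError>
where
    F: FnMut(u32) -> Result<T, TransportError>,
{
    let mut delay = initial_delay;
    // Returned only if `max_attempts` is 0 and `connect` is never called.
    let mut last_err = TransportError("no attempts made".to_string());

    for attempt in 1..=max_attempts {
        match connect(attempt) {
            Ok(value) => return Ok(value),
            Err(err) => {
                last_err = err;
                if attempt < max_attempts {
                    sleep(delay);
                    // Exponential backoff; saturates instead of overflowing.
                    delay = delay.saturating_mul(2);
                }
            }
        }
    }
    Err(last_err)
}
```

A caller would pass a closure that tries to obtain the client once, for example `retry_with_backoff(|_| try_connect(), 5, Duration::from_millis(100))`, where `try_connect` is a hypothetical function. Whether retrying is enough depends on why the connection fails after the restart, which this report does not establish.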
The entire log is in tmp.log. It also contains a few extra log statements that I added.
backtrace of `MetaError`:
0: std::backtrace_rs::backtrace::libunwind::trace
at /rustc/b8c35ca26b191bb9a9ac669a4b3f4d3d52d97fb1/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
1: std::backtrace_rs::backtrace::trace_unsynchronized
at /rustc/b8c35ca26b191bb9a9ac669a4b3f4d3d52d97fb1/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
2: std::backtrace::Backtrace::create
at /rustc/b8c35ca26b191bb9a9ac669a4b3f4d3d52d97fb1/library/std/src/backtrace.rs:333:13
3: <risingwave_meta::error::MetaError as core::convert::From<risingwave_meta::error::MetaErrorInner>>::from
at ./src/meta/src/error.rs:66:33
4: <T as core::convert::Into<U>>::into
at /rustc/b8c35ca26b191bb9a9ac669a4b3f4d3d52d97fb1/library/core/src/convert/mod.rs:726:9
5: <risingwave_meta::error::MetaError as core::convert::From<risingwave_rpc_client::error::RpcError>>::from
at ./src/meta/src/error.rs:135:9
6: <core::result::Result<T,F> as core::ops::try_trait::FromResidual<core::result::Result<core::convert::Infallible,E>>>::from_residual
at /rustc/b8c35ca26b191bb9a9ac669a4b3f4d3d52d97fb1/library/core/src/result.rs:2108:27
7: risingwave_meta::barrier::recovery::<impl risingwave_meta::barrier::GlobalBarrierManager<S>>::reset_compute_nodes::{{closure}}
at ./src/meta/src/barrier/recovery.rs:346:9
8: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/b8c35ca26b191bb9a9ac669a4b3f4d3d52d97fb1/library/core/src/future/mod.rs:91:19
9: risingwave_meta::barrier::recovery::<impl risingwave_meta::barrier::GlobalBarrierManager<S>>::recovery::{{closure}}::{{closure}}::{{closure}}
at ./src/meta/src/barrier/recovery.rs:138:44
10: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/b8c35ca26b191bb9a9ac669a4b3f4d3d52d97fb1/library/core/src/future/mod.rs:91:19
...
7bdef7c53f3516b0d8d1c35721b9dabce821b369
on branch jm/multi-meta-etcd-grafana
MADSIM_CONFIG_HASH=C2F7CAF3EA64B636 MADSIM_TEST_SEED=167077265649007 RUST_BACKTRACE=1 RUST_LOG=info risedev sslt -- --kill-rate=0.1 --kill "e2e_test/**/**/*.slt" > tmp.log 2>&1 ; code tmp.log
Thank you very much for the hint. The error message definitely changed, but we still run into a panic. I will keep this bug open for now, although I am not sure whether it is the same issue.