Home
Logflare AI is a SaaS or self-hosted system that automatically reads your logs, inspects your graphs, and presents suggestions in plain English that point you toward where to start looking.
Logflare can consume your logs with a very short retention, or read directly from your VictoriaLogs, Elasticsearch, or Loki logstore. Logflare does not store any of your logs unless it's necessary.
The most secure implementation spins up a memory-only model for our queries, keeping your data secure.
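As a rough illustration of the direct-read mode, the sketch below pulls recent error-level lines from a VictoriaLogs instance over its LogsQL HTTP query API. The host, port, and query are placeholder assumptions for the example, not Logflare configuration.

```python
import requests

# Hypothetical example: fetch recent error lines from VictoriaLogs so they
# can be fed to an analysis model. Host, port, and query are assumptions.
VLOGS_URL = "http://victorialogs:9428/select/logsql/query"

resp = requests.get(
    VLOGS_URL,
    params={"query": "_time:15m level:error"},  # LogsQL: errors from the last 15 minutes
    timeout=10,
)
resp.raise_for_status()

# VictoriaLogs streams results as newline-delimited JSON objects.
for line in resp.text.splitlines():
    print(line)
```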
The problem: NOC and SOC operators do not always have domain knowledge of every application that is running, but they need good tooling to quickly get an understanding of the issue at hand.
The solution: Transform the raw log data into a textual analysis that comes with a severity and suggested starting points for troubleshooting. This can also be combined with your own knowledge base as a training set, and with your own source code as added context for the model, if that option is chosen.
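Conceptually, the transform looks something like the sketch below: raw log lines go in, and an analysis with a severity and a suggested starting point comes out. The `analyze` function, its prompt, and the stubbed model call are illustrative assumptions, not Logflare's actual API.

```python
import json

def analyze(log_lines: list[str]) -> dict:
    """Hypothetical sketch of the core transform: raw log lines in,
    plain-English analysis with severity and starting points out.
    The model call is stubbed; Logflare's real interface may differ."""
    prompt = (
        "Summarize the following logs, assign a severity, "
        "and suggest where to start troubleshooting:\n" + "\n".join(log_lines)
    )
    # In a memory-only deployment this would call a locally hosted model;
    # here we return a canned response to keep the sketch self-contained.
    return {
        "severity": "warning",
        "summary": "CSI snapshot controller cannot list VolumeSnapshotClass objects.",
        "start_here": "Check that the snapshot.storage.k8s.io CRDs are installed.",
    }

if __name__ == "__main__":
    with open("app.log") as f:
        print(json.dumps(analyze(f.readlines()), indent=2))
```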
Logflare will go from this:
{"timestamp": "2025-12-26 16:18:35,611", "name": "app", "level": "INFO", "message": "🎁E1226 15:03:48.035698 1 reflector.go:205] "Failed to watch" err="failed to list *v1.VolumeSnapshotClass: the server could not find the requested resource (get volumesnapshotclasses.snapshot.storage.k8s.io)" logger="UnhandledError" reflector="github.com/kubernetes-csi/external-snapshotter/client/v8/informers/externalversions/factory.go:142" type="*v1.VolumeSnapshotClass"🎁
"
{"timestamp": "2025-12-26 16:18:35,611", "name": "app", "level": "INFO", "message": "🎁{"level":"info","ts":"2025-12-26T15:03:51.637Z","msg":"server-side apply completed","controller":"kustomization","controllerGroup":"kustomize.toolkit.fluxcd.io","controllerKind":"Kustomization","Kustomization":{"name":"flux-system","namespace":"flux-system"},"namespace":"flux-system","name":"flux-system","reconcileID":"74d65c8d-6c92-4951-90e9-4bc1a8ac7cb2","output":{"ClusterRole/crd-controller-flux-system":"unchanged","ClusterRole/flux-edit-flux-system":"unchanged","ClusterRole/flux-view-flux-system":"unchanged","ClusterRoleBinding/cluster-reconciler-flux-system":"unchanged","ClusterRoleBinding/crd-controller-flux-system":"unchanged","CustomResourceDefinition/alerts.notification.toolkit.fluxcd.io":"unchanged","CustomResourceDefinition/buckets.source.toolkit.fluxcd.io":"unchanged","CustomResourceDefinition/externalartifacts.source.toolkit.fluxcd.io":"unchanged","CustomResourceDefinition/gitrepositories.source.toolkit.fluxcd.io":"unchanged","CustomResourceDefinition/helmcharts.source.toolkit.fluxcd.io":"unchanged","CustomResourceDefinition/helmreleases.helm.toolkit.fluxcd.io":"unchanged","CustomResourceDefinition/helmrepositories.source.toolkit.fluxcd.io":"unchanged","CustomResourceDefinition/kustomizations.kustomize.toolkit.fluxcd.io":"unchanged","CustomResourceDefinition/ocirepositories.source.toolkit.fluxcd.io":"unchanged","CustomResourceDefinition/providers.notification.toolkit.fluxcd.io":"unchanged","CustomResourceDefinition/receivers.notification.toolkit.fluxcd.io":"unchanged","Deployment/flux-system/helm-controller":"unchanged","Deployment/flux-system/kustomize-controller":"unchanged","Deployment/flux-system/notification-controller":"unchanged","Deployment/flux-system/source-controller":"unchanged","GitRepository/flux-system/flux-system":"unchanged","Kustomization/flux-system/flux-system":"unchanged","Namespace/flux-system":"unchanged","NetworkPolicy/flux-system/allow-egress":"unchanged","NetworkPolicy/flux-system/allow-scraping":"unchanged","NetworkPolicy/flux-system/allow-webhooks":"unchanged","ResourceQuota/flux-system/critical-pods-flux-system":"unchanged","Service/flux-system/notification-controller":"unchanged","Service/flux-system/source-controller":"unchanged","Service/flux-system/webhook-receiver":"unchanged","ServiceAccount/flux-system/helm-controller":"unchanged","ServiceAccount/flux-system/kustomize-controller":"unchanged","ServiceAccount/flux-system/notification-controller":"unchanged","ServiceAccount/flux-system/source-controller":"unchanged"},"revision":"main@sha1:e0ab19509563f9cde3544a42264fae5762f7df6f"}🎁
"
{"timestamp": "2025-12-26 16:18:35,611", "name": "app", "level": "INFO", "message": "🎁E1226 15:13:22.111252 1 reflector.go:205] "Failed to watch" err="failed to list *v1.VolumeSnapshotClass: the server could not find the requested resource (get volumesnapshotclasses.snapshot.storage.k8s.io)" logger="UnhandledError" reflector="github.com/kubernetes-csi/external-snapshotter/client/v8/informers/externalversions/factory.go:142" type="*v1.VolumeSnapshotClass"🎁
to this:
Here's a breakdown of the issue and how to address it:
Understanding the Problem
- CSI Drivers and Volume Snapshots: The external-snapshotter CSI driver is responsible for taking snapshots of persistent volumes within your Kubernetes cluster. It interacts directly with the underlying storage system.
- "Failed to List": This error message means the driver couldn't communicate with the storage system to retrieve the necessary metadata to create or manage volume snapshots. The driver is essentially unable to query the storage system.
- Root Cause: The most common causes for this error are:
    - Storage System Connectivity Issues: The CSI driver isn't able to reach the storage system (e.g., AWS EBS, Azure Disk, Google Persistent Disk). This could be due to network problems, incorrect credentials, or firewall restrictions.
    - CSI Driver Configuration Errors: The driver's configuration file (often located in `/etc/cni/net.d/` or a similar path, depending on your Kubernetes distribution) might have incorrect settings.
    - Storage System Problems: The underlying storage system itself might be experiencing issues (e.g., temporary outages, performance problems).
    - Driver Bugs: Although less common, a bug in the external-snapshotter driver could be causing the problem.
Troubleshooting Steps
- Verify Network Connectivity:
    - From the Kubernetes Node: Use `ping`, `traceroute`, or `nslookup` to verify that the Kubernetes node where the CSI driver is running can reach the storage system's endpoint. Pay close attention to DNS resolution.
    - Firewall Rules: Check your firewall rules on both the Kubernetes node and the storage system to ensure that traffic on the necessary ports is allowed. The CSI driver usually uses ports 32768-32783 (UDP and TCP).
- Inspect CSI Driver Configuration:
    - Locate the Configuration File: Find the CSI driver's configuration file. This is the most crucial step.
    - Check Credentials: Ensure that the `username` and `password` (or API keys) in the configuration file are correct and have the necessary permissions to access the storage system. These are the credentials used to authenticate with the storage provider.
    - Verify Endpoint: Double-check that the `endpoint` URL is accurate and points to the correct storage system.
    - Look for Errors: Carefully examine the entire configuration file for any typos or invalid values.
- Check Storage System Status:
    - Storage Provider Dashboard: Log in to your storage provider's console (AWS, Azure, Google Cloud) and check the status of your storage volumes. Look for any errors or outages.
    - Storage Volume Health: Investigate the health of the specific volumes that the CSI driver is trying to use.
- Update CSI Driver:
    - Check for Updates: The external-snapshotter CSI driver is frequently updated to address bugs and improve compatibility. Check the driver's documentation or release notes to see if a newer version is available.
    - Upgrade: If a newer version is available, upgrade the CSI driver.
- Kubernetes Logs: Examine the logs of the CSI driver pod on your Kubernetes node. These logs often provide more detailed information and will frequently point to the root of the problem. Search for "error", "failed", or "timeout" within the logs.
- Kubernetes Events: Use `kubectl get events` to see if there are any events related to the CSI driver or volume snapshots.
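As a supplement to the sample analysis above: the final events check can also be scripted. The snippet below is a minimal sketch using the official Kubernetes Python client (assuming a reachable kubeconfig); it is illustrative and not part of Logflare.

```python
from kubernetes import client, config

# Illustrative sketch: the programmatic equivalent of `kubectl get events`,
# filtered to entries that mention snapshots. Assumes a reachable kubeconfig.
config.load_kube_config()  # or config.load_incluster_config() inside a pod

v1 = client.CoreV1Api()
for event in v1.list_event_for_all_namespaces().items:
    text = f"{event.reason or ''} {event.message or ''}"
    if "snapshot" in text.lower():
        print(f"{event.last_timestamp}  {event.involved_object.kind}/"
              f"{event.involved_object.name}: {event.message}")
```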