High-Performance Real-Time Data System

This document describes the high-performance, in-memory candle data system integrated into PKScreener for real-time stock screening.

Overview

The high-performance data system provides instant access to OHLCV candle data across all supported timeframes without external API rate limits or database latency. It replaces Yahoo Finance as the primary data source during market hours.

Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                        Zerodha Kite WebSocket API                           │
│                    (Real-time tick data stream)                             │
└─────────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌────────────────────────────────────────────────────────────────────────────-─┐
│                           PKBrokers Layer                                    │
│  ┌─────────────────────────────────────────────────────────────────────┐     │
│  │  ZerodhaWebSocketClient → KiteTokenWatcher → InMemoryCandleStore    │     │
│  │                                                                     │     │
│  │  Features:                                                          │     │
│  │    • Real-time tick aggregation into OHLCV candles                  │     │
│  │    • All intervals: 1m, 2m, 3m, 4m, 5m, 10m, 15m, 30m, 60m, daily   │     │
│  │    • O(1) access time for any candle                                │     │
│  │    • Memory-efficient rolling windows                               │     │
│  │    • Auto-persistence every 5 minutes                               │     │
│  └─────────────────────────────────────────────────────────────────────┘     │
└────────────────────────────────────────────────────────────────────────────-─┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                          PKDevTools Layer                                   │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │  PKDataProvider - Unified data access with automatic fallback       │    │
│  │                                                                     │    │
│  │  Data Source Priority:                                              │    │
│  │    1. InMemoryCandleStore (real-time) ◄── Primary                   │    │
│  │    2. Local pickle files (cached)                                   │    │
│  │    3. Remote GitHub pickle files (fallback)                         │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                          PKScreener Layer                                   │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │  screenerStockDataFetcher - Enhanced with real-time support         │    │
│  │                                                                     │    │
│  │  New Methods:                                                       │    │
│  │    • getLatestPrice(symbol)                                         │    │
│  │    • getRealtimeOHLCV(symbol)                                       │    │
│  │    • isRealtimeDataAvailable()                                      │    │
│  │    • getAllRealtimeData()                                           │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────────────────┘

Supported Timeframes

Interval

Description

Candles Stored

Use Case

1m

1 minute

390

Ultra-short term, scalping

2m

2 minutes

195

Short term patterns

3m

3 minutes

130

Custom analysis

4m

4 minutes

98

Custom analysis

5m

5 minutes

78

Intraday trading

10m

10 minutes

39

Intraday swings

15m

15 minutes

26

Intraday trends

30m

30 minutes

13

Position trading

60m

60 minutes

7

Swing trading

day

Daily

365

Trend following

Usage

Basic Usage in Scans

from pkscreener.classes.Fetcher import screenerStockDataFetcher
from pkscreener.classes import ConfigManager

# Initialize fetcher
config = ConfigManager.tools()
fetcher = screenerStockDataFetcher(config)

# Check if real-time data is available
if fetcher.isRealtimeDataAvailable():
    print("Real-time data mode active!")
    
    # Get latest price
    price = fetcher.getLatestPrice("RELIANCE")
    print(f"RELIANCE price: ₹{price}")
    
    # Get real-time OHLCV
    ohlcv = fetcher.getRealtimeOHLCV("TCS")
    print(f"TCS - O:{ohlcv['open']} H:{ohlcv['high']} L:{ohlcv['low']} C:{ohlcv['close']}")
    
    # Get all market data
    all_data = fetcher.getAllRealtimeData()
    print(f"Tracking {len(all_data)} instruments")
else:
    print("Falling back to cached data")

Fetching Stock Data

# Fetch 5-minute candles
df = fetcher.fetchStockData(
    stockCode="RELIANCE",
    period="1d",
    duration="5m",
    exchangeSuffix=".NS"
)

# Data is automatically sourced from:
# 1. Real-time candle store (if available)
# 2. Local pickle files (if cached)
# 3. Remote GitHub files (fallback)

if df is not None:
    print(f"Got {len(df)} candles")
    print(df.tail())

Direct Access to PKDataProvider

from PKDevTools.classes.PKDataProvider import get_data_provider

provider = get_data_provider()

# Get stock data with automatic source selection
df = provider.get_stock_data("INFY", interval="15m", count=50)

# Get multiple stocks
data = provider.get_multiple_stocks(
    ["RELIANCE", "TCS", "INFY"],
    interval="day",
    count=100
)

# Check statistics
stats = provider.get_stats()
print(f"Real-time hits: {stats['realtime_hits']}")
print(f"Cache hits: {stats['cache_hits']}")
print(f"Pickle hits: {stats['pickle_hits']}")

Benefits

Removed Dependencies

Before

After

Yahoo Finance API

❌ Removed

yfinance rate limits

❌ No limits

External API latency

❌ In-memory access

API downtime issues

❌ Local fallback

Performance Improvements

Metric

Before (Yahoo)

After (Real-time)

Data Latency

500ms - 2s

< 1ms

Rate Limits

2000/hour

None

API Failures

Common

N/A

Market Hours Data

Delayed

Real-time

Multi-timeframe

Multiple API calls

Single store access

Data Source Priority

The system automatically selects the best available data source:

┌─────────────────────────────────────────────────────────────────┐
│                    Data Request                                 │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
                    ┌─────────────────┐
                    │ Is Real-time    │
                    │ Available?      │
                    └────────┬────────┘
                             │
              ┌──────────────┴──────────────┐
              │ Yes                         │ No
              ▼                             ▼
    ┌─────────────────┐           ┌─────────────────┐
    │ InMemoryCandle  │           │ Local Pickle    │
    │ Store           │           │ Files           │
    └────────┬────────┘           └────────┬────────┘
             │                              │
             │ Data Found?                  │ Data Found?
             │                              │
        ┌────┴────┐                   ┌────-┴───┐
        │Yes  │No │                   │Yes  │No │
        ▼     ▼   │                   ▼     ▼   │
    ┌──────┐ │    │               ┌──────┐      │
    │Return│ │    │               │Return│      ▼
    │Data  │ │    │               │Data  │  ┌──────────────┐
    └──────┘ │    │               └──────┘  │Remote GitHub │
             │    │                         │Pickle Files  │
             ▼    ▼                         └──────────────┘
         ┌────────────────────────────────────────────────┐
         │              Fallback Chain                    │
         │  Real-time → Local Pickle → Remote Pickle      │
         └────────────────────────────────────────────────┘

Configuration

Enabling Real-Time Data

Real-time data requires PKBrokers to be running with active Zerodha WebSocket connection:

# In PKBrokers - Start the tick watcher
from pkbrokers.kite.kiteTokenWatcher import KiteTokenWatcher

watcher = KiteTokenWatcher()
watcher.watch()  # Starts WebSocket connections

# Real-time data is now available in PKScreener

Environment Variables

# Kite credentials (required for real-time)
export KTOKEN="your_kite_enctoken"
export KUSER="your_kite_user_id"

# Optional: Enable database persistence
export DB_TICKS="1"

Troubleshooting

No Real-Time Data Available

from pkscreener.classes.Fetcher import screenerStockDataFetcher

fetcher = screenerStockDataFetcher()

if not fetcher.isRealtimeDataAvailable():
    # Check why
    if fetcher._hp_provider is None:
        print("PKBrokers not installed or PKDataProvider not available")
    else:
        stats = fetcher._hp_provider.get_stats()
        print(f"Instruments: {stats.get('instrument_count', 0)}")
        print(f"Last tick: {stats.get('last_tick_time', 'N/A')}")

Data Not Updating

  1. Check if KiteTokenWatcher is running

  2. Verify WebSocket connections are active

  3. Check network connectivity

  4. Review PKBrokers logs for errors

Memory Usage

from PKDevTools.classes.PKDataProvider import get_data_provider

provider = get_data_provider()
stats = provider.get_stats()
print(f"Cache size: {stats['cache_size']}")

# Clear cache if needed
provider.clear_cache()

24x7 Data Availability

The high-performance data system is designed to work 24x7, ensuring stock data is always available for scans:

Time Period

Data Source

Description

Market Hours (9:15 AM - 3:30 PM IST)

InMemoryCandleStore

Real-time tick aggregation

After Market

Pickle Files

EOD data from w9-workflow

Weekends/Holidays

Cached Data

Last trading session data

Data Source Priority (24x7)

+-------------------------------------------------------------------+
| Priority 1: InMemoryCandleStore (Real-time)                       |
|    -> Live tick data during market hours                          |
|                                                                   |
| Priority 2: PKScalableDataFetcher (GitHub Raw)                    |
|    -> Pre-published data via w-data-publisher.yml                 |
|                                                                   |
| Priority 3: Local Pickle Cache                                    |
|    -> Downloaded data from previous sessions                      |
|                                                                   |
| Priority 4: Remote GitHub Pickle Files                            |
|    -> 52-week historical data from w9-workflow                    |
+-------------------------------------------------------------------+

How It Works

  1. During Market Hours: Real-time ticks from Zerodha WebSocket are aggregated into candles

  2. After Market Close: w9-workflow downloads 52-week data and saves to pickle files

  3. 24x7 Publisher: w-data-publisher.yml runs every 5 min during market, every 2 hours otherwise

  4. Scan Anytime: Users can trigger scans from Telegram bot at any time with available data

See Scalable Architecture for detailed 24x7 implementation.

See Also