Initial commit: Japan Senior News Collector
- FastAPI backend with news scraping from Yahoo Japan
- SQLite database for article storage
- Web UI with dark mode, article modal, statistics dashboard
- Docker support for containerized deployment
- API endpoints: /api/today, /api/news, /api/collect-news, /api/dates, /api/download-json
- Auto-collect feature when requesting today's news
- Content filtering for articles without body text
7
.dockerignore
Normal file
@@ -0,0 +1,7 @@
venv/
__pycache__/
*.pyc
*.pyo
.git/
.gitignore
*.db
24
.gitignore
vendored
Normal file
@@ -0,0 +1,24 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
venv/
.env

# Database
*.db

# IDE
.vscode/
.idea/

# Debug/Test files
debug_output.txt
test_*.py

# Generated files
*.json

# OS
.DS_Store
52
DEPLOY.md
Normal file
@@ -0,0 +1,52 @@
# Deployment Guide for Japan News Collector

This guide explains how to deploy the application with Docker on a home server or any other machine that has Docker installed.

## Prerequisites

- [Docker](https://docs.docker.com/get-docker/) installed.
- [Docker Compose](https://docs.docker.com/compose/install/) installed (usually included with Docker Desktop/Docker Engine).

## Files Overview

- **Dockerfile**: Defines the environment (Python 3.9) and dependencies.
- **docker-compose.yml**: Orchestrates the container, maps ports (8000), and persists data (`news.db`).

## Deployment Steps

1. **Transfer Files**: Copy the entire project folder to your server.
2. **Navigate to Directory**:
   ```bash
   cd japan-news
   ```
3. **Start the Service**:
   Run the following command to build and start the container in the background:
   ```bash
   docker-compose up -d --build
   ```

## Managing the Service

- **Check Logs**:
  ```bash
  docker-compose logs -f
  ```
- **Stop the Service**:
  ```bash
  docker-compose down
  ```
- **Restart**:
  ```bash
  docker-compose restart
  ```

## Data Persistence

The database file `news.db` in the project folder is bind-mounted into the container at `/app/news.db`; create it before the first start (for example with `touch news.db`), otherwise Docker creates a directory with that name.
- Even if you stop or remove the container, the data in `news.db` on the host machine is kept.
- **Backup**: Back up the `news.db` file.
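
Copying the file while the container is writing to it can catch the database mid-transaction. As a hedged sketch of a safer backup, the snippet below uses SQLite's online backup API, which Python exposes as `sqlite3.Connection.backup`. The function name `backup_database` and the destination path are assumptions made for illustration; the project itself does not ship a backup script.

```python
import sqlite3


def backup_database(src_path: str, dest_path: str) -> None:
    """Copy a SQLite database to dest_path using the online backup API.

    backup_database and both path arguments are hypothetical names for this
    sketch; the guide itself only describes copying news.db by hand.
    """
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # Connection.backup copies the database page by page and produces a
        # consistent snapshot even if another connection writes meanwhile.
        src.backup(dest)
    finally:
        dest.close()
        src.close()
```

For example, `backup_database("news.db", "news-backup.db")` run from the project folder on the host would write a snapshot next to the live file.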

## Accessing the Application

Open your browser and navigate to:
`http://localhost:8000` (or your server's IP address: `http://<server-ip>:8000`)
15
Dockerfile
Normal file
@@ -0,0 +1,15 @@
FROM python:3.9-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Create static directory if it doesn't exist
RUN mkdir -p static

EXPOSE 8000

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
181
README.md
Normal file
@@ -0,0 +1,181 @@
# Japan Senior News Collector

A web application that automatically collects and manages senior-related news from Yahoo Japan.

## Key Features

### News Collection
- Automatically collects news from Yahoo Japan in four categories
  - Health
  - Lifestyle
  - Economy
  - Society
- Collects up to 5 articles per category
- Automatically extracts the article body
- Automatically filters out articles that have no content
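
The content filter in the last bullet can be pictured as a small predicate. This is only a sketch that mirrors the SQL condition used in `database.py` (`content IS NOT NULL AND content != '' AND content != 'Content not found.'`); the helper names `has_usable_content` and `filter_articles` are hypothetical and do not exist in the project.

```python
from typing import List, Optional

# Placeholder text the scraper stores when it finds no article body container.
NOT_FOUND_PLACEHOLDER = "Content not found."


def has_usable_content(content: Optional[str]) -> bool:
    """Return True when an article body is worth showing (hypothetical helper)."""
    return content is not None and content != "" and content != NOT_FOUND_PLACEHOLDER


def filter_articles(articles: List[dict]) -> List[dict]:
    """Keep only article dicts whose 'content' value passes the check above."""
    return [a for a in articles if has_usable_content(a.get("content"))]
```

In the project the filtering happens inside the SQL query, so rows without content never reach the API layer; the Python form is shown only to make the rule explicit.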

### Web UI
- Card view of news by category
- Detail modal shown when an article is clicked
- History lookup by date
- Dark mode support
- Statistics dashboard (article count per category)
- JSON download

## Tech Stack

- **Backend**: FastAPI, Python 3.9
- **Database**: SQLite
- **Frontend**: HTML, Tailwind CSS, JavaScript
- **Scraping**: BeautifulSoup4, Requests
- **Container**: Docker

## Installation and Running

### Run with Docker (recommended)

```bash
# Build the image
docker build -t japan-news .

# Run the container
docker run -d --name japan-news -p 8001:8000 japan-news
```

### Run locally

```bash
# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Start the server
uvicorn main:app --reload --port 8000
```

## API Endpoints

### GET /
Returns the web UI page.

### GET /api/today
Returns today's news. If no articles have been collected today, they are collected first and then returned.

**Example response:**
```json
{
  "date": "2025-12-15",
  "articles": {
    "Economy": [...],
    "Society": [...],
    "Lifestyle": [...],
    "Health": [...]
  },
  "total_count": 19
}
```
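
A client for this endpoint could look like the sketch below. It assumes the Docker port mapping from this README (`http://localhost:8001`) and a generous timeout, because the first request of the day may trigger scraping before it returns; `fetch_today` and `count_by_category` are hypothetical helper names, not part of the project.

```python
import json
from typing import Dict
from urllib.request import urlopen


def fetch_today(base_url: str = "http://localhost:8001") -> dict:
    """GET /api/today and decode the JSON body.

    base_url and the 120-second timeout are assumptions for this sketch.
    """
    with urlopen(f"{base_url}/api/today", timeout=120) as resp:
        return json.load(resp)


def count_by_category(payload: dict) -> Dict[str, int]:
    """Count the articles listed under each category of an /api/today response."""
    return {cat: len(items) for cat, items in payload.get("articles", {}).items()}
```

The sum of the values returned by `count_by_category(fetch_today())` should match the `total_count` field of the same response.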

### GET /api/news
Returns the list of news articles.

**Query Parameters:**
- `date` (optional): Date to look up (YYYY-MM-DD format). If omitted, the most recently collected articles are returned (up to 5 per category).
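
Since the server passes `date` straight into a SQL date comparison, a caller may want to check the format before sending it. The following is a minimal sketch; `build_news_url` is a hypothetical helper and the base URL is whatever host and port the service is reachable on.

```python
from datetime import datetime
from typing import Optional
from urllib.parse import urlencode


def build_news_url(base_url: str, date: Optional[str] = None) -> str:
    """Build the /api/news URL, rejecting dates that are not YYYY-MM-DD."""
    if date is None:
        return f"{base_url}/api/news"
    # strptime raises ValueError for strings that are not real calendar dates.
    parsed = datetime.strptime(date, "%Y-%m-%d")
    # strptime tolerates missing zero padding ("2025-1-5"), so compare with
    # the canonical ISO form to enforce the exact YYYY-MM-DD shape.
    if parsed.date().isoformat() != date:
        raise ValueError(f"date must be zero-padded YYYY-MM-DD, got {date!r}")
    return f"{base_url}/api/news?{urlencode({'date': date})}"
```

The same check applies to the `date` parameter of `/api/download-json`, which uses the identical format.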

### POST /api/collect-news
Runs news collection.

**Example response:**
```json
{
  "status": "success",
  "collected_count": 20,
  "details": {
    "Economy": 5,
    "Society": 5,
    "Lifestyle": 5,
    "Health": 5
  }
}
```

### GET /api/dates
Returns the list of dates on which news was collected.

**Example response:**
```json
{
  "dates": ["2025-12-15", "2025-12-14", "2025-12-13"]
}
```

### GET /api/download-json
Downloads the news data as a JSON file.

**Query Parameters:**
- `date` (optional): Date to download (YYYY-MM-DD format)

## Project Structure

```
japan-news/
├── main.py              # FastAPI application
├── database.py          # SQLite database management
├── scraper.py           # Yahoo Japan news scraper
├── requirements.txt     # Python dependencies
├── Dockerfile           # Docker configuration
├── .dockerignore        # Files excluded from the Docker build
├── static/
│   └── index.html       # Web UI
└── README.md
```

## Database Schema

### articles table

| Column | Type | Description |
|------|------|------|
| id | INTEGER | Primary key |
| title | TEXT | Article title |
| url | TEXT | Article URL (UNIQUE) |
| image_url | TEXT | Thumbnail image URL |
| published_date | TEXT | Publication date |
| category | TEXT | Category |
| source | TEXT | Source |
| collected_at | TEXT | Collection time (ISO format) |
| content | TEXT | Article body |
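
To see how the `UNIQUE` constraint on `url` and the ISO-formatted `collected_at` column behave, here is a small in-memory sketch. The table definition follows `database.py`; the upsert is simplified for illustration (the real `save_article` also refreshes `image_url` and `published_date`), and the example URL and timestamps are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE articles (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        title TEXT NOT NULL,
        url TEXT UNIQUE NOT NULL,
        image_url TEXT,
        published_date TEXT,
        category TEXT,
        source TEXT,
        collected_at TEXT,
        content TEXT
    )
""")

# Simplified upsert: on a duplicate URL only the body is replaced.
UPSERT = """
    INSERT INTO articles (title, url, category, collected_at, content)
    VALUES (?, ?, ?, ?, ?)
    ON CONFLICT(url) DO UPDATE SET content = excluded.content
"""

# Saving the same URL twice updates the existing row instead of adding one.
conn.execute(UPSERT, ("Title", "https://example.com/a", "Health", "2025-12-15T09:00:00", "first body"))
conn.execute(UPSERT, ("Title", "https://example.com/a", "Health", "2025-12-15T10:00:00", "second body"))

rows = conn.execute("SELECT url, collected_at, content FROM articles").fetchall()
# date(collected_at) extracts the YYYY-MM-DD part that the date filters compare against.
dates = conn.execute("SELECT DISTINCT date(collected_at) FROM articles").fetchall()
```

Note that `collected_at` keeps the time of the first insert, because the conflict clause does not touch it; an article collected again on a later day therefore stays listed under its original date.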

## External Integration

The `/api/today` endpoint lets external systems fetch today's news easily.

```bash
# Fetch today's news (collected automatically if missing)
curl http://localhost:8001/api/today
```

## Docker Management Commands

```bash
# View logs
docker logs -f japan-news

# Stop the container
docker stop japan-news

# Start the container
docker start japan-news

# Remove the container
docker rm -f japan-news

# Rebuild the image and run
docker rm -f japan-news && docker build -t japan-news . && docker run -d --name japan-news -p 8001:8000 japan-news
```

## License

MIT License
103
database.py
Normal file
@@ -0,0 +1,103 @@
import sqlite3
from datetime import datetime, date
from typing import List, Optional
from pydantic import BaseModel, Field

DB_NAME = "news.db"

class Article(BaseModel):
    title: str
    url: str
    image_url: Optional[str] = None
    published_date: Optional[str] = None
    category: str
    source: str = "Yahoo Japan"
    collected_at: str = Field(default_factory=lambda: datetime.now().isoformat())  # evaluated per instance, not once at import
    content: Optional[str] = None

def init_db():
    conn = sqlite3.connect(DB_NAME)
    c = conn.cursor()
    c.execute('''
        CREATE TABLE IF NOT EXISTS articles (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            title TEXT NOT NULL,
            url TEXT UNIQUE NOT NULL,
            image_url TEXT,
            published_date TEXT,
            category TEXT,
            source TEXT,
            collected_at TEXT,
            content TEXT
        )
    ''')
    conn.commit()
    conn.close()

def save_article(article: Article):
    conn = sqlite3.connect(DB_NAME)
    c = conn.cursor()
    try:
        # Check if content column exists (for migration)
        cursor = c.execute("PRAGMA table_info(articles)")
        columns = [info[1] for info in cursor.fetchall()]
        if 'content' not in columns:
            c.execute("ALTER TABLE articles ADD COLUMN content TEXT")
            conn.commit()

        c.execute('''
            INSERT INTO articles (title, url, image_url, published_date, category, source, collected_at, content)
            VALUES (?, ?, ?, ?, ?, ?, ?, ?)
            ON CONFLICT(url) DO UPDATE SET
                content = excluded.content,
                image_url = excluded.image_url,
                published_date = excluded.published_date
        ''', (article.title, article.url, article.image_url, article.published_date, article.category, article.source, article.collected_at, article.content))
        conn.commit()
    except Exception as e:
        print(f"Error saving article: {e}")
    finally:
        conn.close()

def get_articles(category: Optional[str] = None, collection_date: Optional[str] = None, limit: int = 5) -> List[dict]:
    conn = sqlite3.connect(DB_NAME)
    conn.row_factory = sqlite3.Row
    c = conn.cursor()

    # Filter out articles without content
    query = "SELECT * FROM articles WHERE content IS NOT NULL AND content != '' AND content != 'Content not found.'"
    params = []

    if category:
        query += " AND category = ?"
        params.append(category)

    if collection_date:
        # Assuming collection_date is 'YYYY-MM-DD' and collected_at is ISO format
        query += " AND date(collected_at) = ?"
        params.append(collection_date)
    else:
        # Default to today if no date specified? Or just get latest?
        # User said "collect news based on today".
        # But for viewing, maybe we just want the latest batch.
        # Let's order by collected_at desc
        pass

    query += " ORDER BY collected_at DESC LIMIT ?"
    params.append(limit)

    c.execute(query, tuple(params))
    rows = c.fetchall()
    conn.close()


    return [dict(row) for row in rows]

def get_available_dates() -> List[str]:
    conn = sqlite3.connect(DB_NAME)
    c = conn.cursor()
    # Extract distinct dates YYYY-MM-DD
    c.execute("SELECT DISTINCT date(collected_at) as date_val FROM articles ORDER BY date_val DESC")
    rows = c.fetchall()
    conn.close()
    return [row[0] for row in rows if row[0]]
11
docker-compose.yml
Normal file
@@ -0,0 +1,11 @@
version: '3.8'

services:
  japan-news:
    build: .
    container_name: japan-news-collector
    ports:
      - "8000:8000"
    volumes:
      - ./news.db:/app/news.db
    restart: unless-stopped
126
main.py
Normal file
@@ -0,0 +1,126 @@
from fastapi import FastAPI, HTTPException
from fastapi.responses import FileResponse
from fastapi.staticfiles import StaticFiles
from pydantic import BaseModel
from typing import Dict, List, Optional
import database
from scraper import NewsScraper
from datetime import datetime
import os

app = FastAPI()

# Initialize DB
database.init_db()

# Serve static files
os.makedirs("static", exist_ok=True)
app.mount("/static", StaticFiles(directory="static"), name="static")

@app.get("/")
async def read_root():
    return FileResponse('static/index.html')

@app.post("/api/collect-news")
async def collect_news():
    scraper = NewsScraper()
    categories = ["Economy", "Society", "Lifestyle", "Health"]
    results = {}

    total_count = 0
    for cat in categories:
        articles = scraper.scrape_category(cat, limit=5)
        for article in articles:
            database.save_article(article)
        results[cat] = len(articles)
        total_count += len(articles)

    return {"status": "success", "collected_count": total_count, "details": results}

@app.get("/api/dates")
async def get_dates():
    dates = database.get_available_dates()
    return {"dates": dates}

from fastapi.responses import FileResponse, Response
import json

@app.get("/api/download-json")
async def download_json(date: Optional[str] = None):
    # Reuse get_news logic
    categories = ["Economy", "Society", "Lifestyle", "Health"]
    data = {}

    # Use provided date or today/latest if None
    # If date is None, get_articles(limit=5) gets latest regardless of date.
    # To be precise for file name, if date is None, we might want to know the date of collected items?
    # For now let's just use the logic we have.

    for cat in categories:
        articles = database.get_articles(category=cat, collection_date=date, limit=5)
        # Convert sqlite rows to dicts if not already (get_articles does it)
        data[cat] = articles

    file_date = date if date else datetime.now().strftime("%Y-%m-%d")
    filename = f"japan-news-{file_date}.json"
    json_content = json.dumps(data, indent=2, ensure_ascii=False)

    return Response(
        content=json_content,
        media_type="application/json",
        headers={"Content-Disposition": f"attachment; filename={filename}"}
    )

@app.get("/api/news")
async def get_news(date: Optional[str] = None):
    # Helper to restructure for frontend
    categories = ["Economy", "Society", "Lifestyle", "Health"]
    response_data = {}

    for cat in categories:
        articles = database.get_articles(category=cat, collection_date=date, limit=5)
        response_data[cat] = articles

    return response_data


@app.get("/api/today")
async def get_today_news():
    """
    Get today's news. If no articles exist for today, collect them first.
    Returns JSON with all categories.
    """
    today = datetime.now().strftime("%Y-%m-%d")
    categories = ["Economy", "Society", "Lifestyle", "Health"]

    # Check if we have any articles for today
    has_today_articles = False
    for cat in categories:
        articles = database.get_articles(category=cat, collection_date=today, limit=1)
        if articles:
            has_today_articles = True
            break

    # If no articles for today, collect them
    if not has_today_articles:
        scraper = NewsScraper()
        for cat in categories:
            articles = scraper.scrape_category(cat, limit=5)
            for article in articles:
                database.save_article(article)

    # Return today's articles
    response_data = {
        "date": today,
        "articles": {}
    }

    total_count = 0
    for cat in categories:
        articles = database.get_articles(category=cat, collection_date=today, limit=5)
        response_data["articles"][cat] = articles
        total_count += len(articles)

    response_data["total_count"] = total_count

    return response_data
5
requirements.txt
Normal file
@@ -0,0 +1,5 @@
fastapi
uvicorn
beautifulsoup4
requests
pydantic
167
scraper.py
Normal file
@@ -0,0 +1,167 @@
import requests
from bs4 import BeautifulSoup
from typing import List, Optional
from database import Article
from datetime import datetime
import time
import random

class NewsScraper:
    BASE_URL = "https://news.yahoo.co.jp"

    CATEGORIES = {
        "Economy": "https://news.yahoo.co.jp/categories/business",
        "Society": "https://news.yahoo.co.jp/categories/domestic",
        "Lifestyle": "https://news.yahoo.co.jp/categories/life",
        "Health": "https://news.yahoo.co.jp/search?p=%E5%81%A5%E5%BA%B7&ei=utf-8" # Search for 'Health'
    }

    HEALTH_KEYWORDS = ["健康", "医療", "病気", "病院", "医師", "薬", "ワクチン", "感染", "介護", "認知症", "老化", "ダイエット", "運動", "睡眠", "ストレス", "メンタル"]

    def scrape_category(self, category_name: str, limit: int = 5) -> List[Article]:
        url = self.CATEGORIES.get(category_name)
        if not url:
            print(f"Unknown category: {category_name}")
            return []

        print(f"Scraping {category_name} from {url}...")
        try:
            response = requests.get(url, timeout=10)  # timeout so a stalled connection cannot hang the request
            response.raise_for_status()
            soup = BeautifulSoup(response.content, "html.parser")

            articles = []

            # Find all links that look like article links
            # Yahoo Japan News article links typically contain 'news.yahoo.co.jp/articles/' or 'news.yahoo.co.jp/pickup/'
            candidates = soup.find_all('a')
            print(f"Found {len(candidates)} total links")

            seen_urls = set()

            for link in candidates:
                if len(articles) >= limit:
                    break

                href = link.get('href')
                if not href:
                    continue

                if 'news.yahoo.co.jp/articles/' in href or 'news.yahoo.co.jp/pickup/' in href:
                    # Clean up URL
                    if href.startswith('/'):
                        href = self.BASE_URL + href

                    if href in seen_urls:
                        continue

                    # Extract title
                    title = link.get_text(strip=True)
                    if len(title) < 5:
                        continue

                    if category_name == "Health":
                        pass

                    # Image extraction
                    img_tag = link.find('img')
                    image_url = img_tag.get('src') if img_tag else None

                    seen_urls.add(href)

                    print(f"Found article: {title}")

                    # Handle Pickup URLs - Resolve to real article URL
                    final_url = href
                    if "/pickup/" in href:
                        print(f"  Resolving pickup URL: {href}")
                        real_url = self.resolve_pickup_url(href)
                        if real_url:
                            print(f"  -> Resolved to: {real_url}")
                            final_url = real_url

                    article = Article(
                        title=title,
                        url=final_url, # Store the final URL
                        image_url=image_url,
                        category=category_name,
                        published_date=datetime.now().strftime("%Y-%m-%d"),
                        collected_at=datetime.now().isoformat()
                    )

                    # Fetch Full Content
                    try:
                        print(f"  Fetching content for {title[:10]}...")
                        content = self.scrape_article_body(final_url)
                        article.content = content
                        time.sleep(random.uniform(0.5, 1.5))
                    except Exception as e:
                        print(f"  Failed to fetch content: {e}")
                        article.content = "Failed to load content."

                    articles.append(article)

            print(f"Total articles collected for {category_name}: {len(articles)}")

        except Exception as e:
            print(f"Error scraping {category_name}: {e}")
            import traceback
            traceback.print_exc()
            return []

        return articles

    def resolve_pickup_url(self, pickup_url: str) -> Optional[str]:
        try:
            response = requests.get(pickup_url, timeout=10)
            response.raise_for_status()
            soup = BeautifulSoup(response.content, "html.parser")

            # Look for "続きを読む" link
            link = soup.find('a', string=lambda t: t and '続きを読む' in t)
            if link and link.get('href'):
                return link.get('href')

            # Fallback: look for any news.yahoo.co.jp/articles/ link in the main content area
            # Usually pickup pages have a clear link to the full story
            candidates = soup.find_all('a')
            for anchor in candidates:
                href = anchor.get('href')
                if href and 'news.yahoo.co.jp/articles/' in href:
                    return href

            return None
        except Exception as e:
            print(f"Error resolving pickup URL {pickup_url}: {e}")
            return None

    def scrape_article_body(self, url: str) -> str:
        try:
            response = requests.get(url, timeout=10)
            response.raise_for_status()
            soup = BeautifulSoup(response.content, "html.parser")

            # Selector identified via browser tool: div.sc-iMCRTP.eqMceQ.yjSlinkDirectLink (or generic sc-iMCRTP)
            # Collecting all paragraphs within the main article body container
            # We look for the container div
            container = soup.select_one("div.sc-iMCRTP")
            if not container:
                # Fallback to article_body class search if specific class changes
                container = soup.find("div", class_=lambda x: x and "article_body" in x)

            if container:
                paragraphs = container.find_all('p')
                text = "\n\n".join([p.get_text(strip=True) for p in paragraphs])
                return text

            return "Content not found."

        except Exception as e:
            print(f"Error scraping body from {url}: {e}")
            return ""

if __name__ == "__main__":
    # Test run
    scraper = NewsScraper()
    news = scraper.scrape_category("Society", limit=2)
    print(news)
667
static/index.html
Normal file
@@ -0,0 +1,667 @@
<!DOCTYPE html>
<html lang="ja">

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Japan Senior News Collector</title>
    <script src="https://cdn.tailwindcss.com"></script>
    <script>
        tailwind.config = {
            darkMode: 'class',
            theme: {
                extend: {
                    animation: {
                        'fade-in': 'fadeIn 0.3s ease-out',
                        'slide-up': 'slideUp 0.3s ease-out',
                        'slide-in': 'slideIn 0.3s ease-out',
                        'pulse-slow': 'pulse 3s infinite',
                    }
                }
            }
        }
    </script>
    <style>
        @keyframes fadeIn {
            from { opacity: 0; }
            to { opacity: 1; }
        }
        @keyframes slideUp {
            from { opacity: 0; transform: translateY(20px); }
            to { opacity: 1; transform: translateY(0); }
        }
        @keyframes slideIn {
            from { opacity: 0; transform: translateX(-20px); }
            to { opacity: 1; transform: translateX(0); }
        }

        body {
            font-family: 'Hiragino Kaku Gothic Pro', 'Meiryo', sans-serif;
        }

        .scrollbar-thin::-webkit-scrollbar {
            width: 6px;
        }
        .scrollbar-thin::-webkit-scrollbar-track {
            background: transparent;
        }
        .scrollbar-thin::-webkit-scrollbar-thumb {
            background: #cbd5e1;
            border-radius: 3px;
        }
        .dark .scrollbar-thin::-webkit-scrollbar-thumb {
            background: #475569;
        }

        .line-clamp-2 {
            display: -webkit-box;
            -webkit-line-clamp: 2;
            -webkit-box-orient: vertical;
            overflow: hidden;
        }
        .line-clamp-3 {
            display: -webkit-box;
            -webkit-line-clamp: 3;
            -webkit-box-orient: vertical;
            overflow: hidden;
        }

        .card-hover {
            transition: all 0.3s cubic-bezier(0.4, 0, 0.2, 1);
        }
        .card-hover:hover {
            transform: translateY(-4px);
            box-shadow: 0 20px 25px -5px rgba(0, 0, 0, 0.1), 0 10px 10px -5px rgba(0, 0, 0, 0.04);
        }
        .dark .card-hover:hover {
            box-shadow: 0 20px 25px -5px rgba(0, 0, 0, 0.4), 0 10px 10px -5px rgba(0, 0, 0, 0.2);
        }

        .glass-effect {
            backdrop-filter: blur(10px);
            background: rgba(255, 255, 255, 0.8);
        }
        .dark .glass-effect {
            background: rgba(30, 41, 59, 0.8);
        }
    </style>
</head>

<body class="bg-gray-100 dark:bg-slate-900 text-gray-800 dark:text-gray-200 h-screen overflow-hidden flex transition-colors duration-300">

    <!-- Sidebar -->
    <aside class="w-72 bg-white dark:bg-slate-800 shadow-lg flex-shrink-0 flex flex-col h-full border-r border-gray-200 dark:border-slate-700 transition-colors duration-300">
        <div class="p-6 border-b border-gray-200 dark:border-slate-700">
            <h2 class="text-xl font-bold text-gray-800 dark:text-white flex items-center gap-2">
                <svg class="w-6 h-6 text-blue-600 dark:text-blue-400" fill="none" stroke="currentColor" viewBox="0 0 24 24">
                    <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M8 7V3m8 4V3m-9 8h10M5 21h14a2 2 0 002-2V7a2 2 0 00-2-2H5a2 2 0 00-2 2v12a2 2 0 002 2z"/>
                </svg>
                News History
            </h2>
        </div>
        <div class="flex-1 overflow-y-auto p-4 space-y-2 scrollbar-thin" id="date-list">
            <div class="animate-pulse flex space-x-4">
                <div class="h-4 bg-gray-200 dark:bg-slate-700 rounded w-3/4"></div>
            </div>
        </div>

        <!-- Dark Mode Toggle -->
        <div class="p-4 border-t border-gray-200 dark:border-slate-700">
            <button id="darkModeToggle" class="w-full flex items-center justify-center gap-2 px-4 py-2 rounded-lg bg-gray-100 dark:bg-slate-700 hover:bg-gray-200 dark:hover:bg-slate-600 transition-colors">
                <svg id="sunIcon" class="w-5 h-5 hidden dark:block text-yellow-400" fill="currentColor" viewBox="0 0 20 20">
                    <path fill-rule="evenodd" d="M10 2a1 1 0 011 1v1a1 1 0 11-2 0V3a1 1 0 011-1zm4 8a4 4 0 11-8 0 4 4 0 018 0zm-.464 4.95l.707.707a1 1 0 001.414-1.414l-.707-.707a1 1 0 00-1.414 1.414zm2.12-10.607a1 1 0 010 1.414l-.706.707a1 1 0 11-1.414-1.414l.707-.707a1 1 0 011.414 0zM17 11a1 1 0 100-2h-1a1 1 0 100 2h1zm-7 4a1 1 0 011 1v1a1 1 0 11-2 0v-1a1 1 0 011-1zM5.05 6.464A1 1 0 106.465 5.05l-.708-.707a1 1 0 00-1.414 1.414l.707.707zm1.414 8.486l-.707.707a1 1 0 01-1.414-1.414l.707-.707a1 1 0 011.414 1.414zM4 11a1 1 0 100-2H3a1 1 0 000 2h1z"/>
                </svg>
                <svg id="moonIcon" class="w-5 h-5 block dark:hidden text-slate-700" fill="currentColor" viewBox="0 0 20 20">
                    <path d="M17.293 13.293A8 8 0 016.707 2.707a8.001 8.001 0 1010.586 10.586z"/>
                </svg>
                <span class="text-sm font-medium dark:hidden">Dark Mode</span>
                <span class="text-sm font-medium hidden dark:inline">Light Mode</span>
            </button>
        </div>

        <div class="p-4 border-t border-gray-200 dark:border-slate-700 text-xs text-gray-500 dark:text-gray-400 text-center">
            © 2025 Senior News
        </div>
    </aside>

    <!-- Main Content -->
    <main class="flex-1 flex flex-col h-full overflow-hidden">

        <!-- Header -->
        <header class="glass-effect shadow-sm px-8 py-4 flex justify-between items-center z-10 border-b border-gray-200 dark:border-slate-700 transition-colors duration-300">
            <div>
                <h1 class="text-2xl font-bold text-blue-900 dark:text-blue-400">Senior Daily News Collector</h1>
                <p id="current-view-label" class="text-sm text-gray-500 dark:text-gray-400">Viewing: Latest</p>
            </div>
            <div class="flex items-center gap-4">
                <button id="downloadBtn"
                    class="bg-green-600 hover:bg-green-700 text-white font-bold py-2 px-6 rounded-lg shadow transition duration-200 flex items-center gap-2">
                    <svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5" fill="none" viewBox="0 0 24 24"
                        stroke="currentColor">
                        <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2"
                            d="M4 16v1a3 3 0 003 3h10a3 3 0 003-3v-1m-4-4l-4 4m0 0l-4-4m4 4V4" />
                    </svg>
                    Download JSON
                </button>
                <button id="collectBtn"
                    class="bg-blue-600 hover:bg-blue-700 text-white font-bold py-2 px-6 rounded-lg shadow transition duration-200 flex items-center gap-2">
                    <svg xmlns="http://www.w3.org/2000/svg" class="h-5 w-5" fill="none" viewBox="0 0 24 24"
                        stroke="currentColor">
                        <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2"
                            d="M4 16v1a3 3 0 003 3h10a3 3 0 003-3v-1m-4-8l-4-4m0 0L8 8m4-4v12" />
                    </svg>
                    Collect Today's News
                </button>
            </div>
        </header>

        <!-- Scrollable Content Area -->
        <div class="flex-1 overflow-y-auto p-8 relative scrollbar-thin">

            <!-- Loading Overlay -->
            <div id="loading"
                class="hidden absolute inset-0 bg-white/80 dark:bg-slate-900/80 z-20 flex flex-col items-center justify-center backdrop-blur-sm">
                <div class="relative">
                    <div class="animate-spin rounded-full h-16 w-16 border-4 border-blue-200 dark:border-blue-900 border-t-blue-600 dark:border-t-blue-400"></div>
                </div>
                <p class="text-blue-800 dark:text-blue-300 font-semibold mt-4">Processing...</p>
            </div>

            <!-- Statistics Dashboard -->
            <div id="stats-dashboard" class="mb-8 animate-fade-in">
                <div class="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-5 gap-4 mb-6">
                    <!-- Total Articles -->
                    <div class="bg-white dark:bg-slate-800 rounded-xl shadow-md p-4 card-hover border border-gray-100 dark:border-slate-700">
                        <div class="flex items-center justify-between">
                            <div>
                                <p class="text-sm text-gray-500 dark:text-gray-400">Total Articles</p>
                                <p id="stat-total" class="text-2xl font-bold text-gray-800 dark:text-white">0</p>
                            </div>
                            <div class="w-12 h-12 bg-blue-100 dark:bg-blue-900/50 rounded-full flex items-center justify-center">
                                <svg class="w-6 h-6 text-blue-600 dark:text-blue-400" fill="none" stroke="currentColor" viewBox="0 0 24 24">
                                    <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M19 20H5a2 2 0 01-2-2V6a2 2 0 012-2h10a2 2 0 012 2v1m2 13a2 2 0 01-2-2V7m2 13a2 2 0 002-2V9a2 2 0 00-2-2h-2m-4-3H9M7 16h6M7 8h6v4H7V8z"/>
                                </svg>
                            </div>
                        </div>
                    </div>

                    <!-- Category Stats -->
                    <div class="bg-white dark:bg-slate-800 rounded-xl shadow-md p-4 card-hover border border-green-100 dark:border-green-900/50">
                        <div class="flex items-center justify-between">
                            <div>
                                <p class="text-sm text-gray-500 dark:text-gray-400">Health</p>
                                <p id="stat-health" class="text-2xl font-bold text-green-600 dark:text-green-400">0</p>
                            </div>
                            <div class="w-12 h-12 bg-green-100 dark:bg-green-900/50 rounded-full flex items-center justify-center text-2xl">
                                🏥
                            </div>
                        </div>
                    </div>

                    <div class="bg-white dark:bg-slate-800 rounded-xl shadow-md p-4 card-hover border border-orange-100 dark:border-orange-900/50">
                        <div class="flex items-center justify-between">
                            <div>
                                <p class="text-sm text-gray-500 dark:text-gray-400">Lifestyle</p>
                                <p id="stat-lifestyle" class="text-2xl font-bold text-orange-600 dark:text-orange-400">0</p>
                            </div>
                            <div class="w-12 h-12 bg-orange-100 dark:bg-orange-900/50 rounded-full flex items-center justify-center text-2xl">
                                🏡
                            </div>
                        </div>
                    </div>

                    <div class="bg-white dark:bg-slate-800 rounded-xl shadow-md p-4 card-hover border border-blue-100 dark:border-blue-900/50">
                        <div class="flex items-center justify-between">
                            <div>
                                <p class="text-sm text-gray-500 dark:text-gray-400">Economy</p>
                                <p id="stat-economy" class="text-2xl font-bold text-blue-600 dark:text-blue-400">0</p>
                            </div>
                            <div class="w-12 h-12 bg-blue-100 dark:bg-blue-900/50 rounded-full flex items-center justify-center text-2xl">
                                💼
                            </div>
                        </div>
                    </div>

                    <div class="bg-white dark:bg-slate-800 rounded-xl shadow-md p-4 card-hover border border-purple-100 dark:border-purple-900/50">
                        <div class="flex items-center justify-between">
                            <div>
                                <p class="text-sm text-gray-500 dark:text-gray-400">Society</p>
                                <p id="stat-society" class="text-2xl font-bold text-purple-600 dark:text-purple-400">0</p>
                            </div>
                            <div class="w-12 h-12 bg-purple-100 dark:bg-purple-900/50 rounded-full flex items-center justify-center text-2xl">
                                📢
                            </div>
                        </div>
                    </div>
                </div>
            </div>
|
||||
|
||||
            <!-- Content Grid (Card View) -->
            <div id="cardView" class="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-4 gap-6">
                <!-- Health Column -->
                <div class="bg-white dark:bg-slate-800 rounded-xl shadow-md flex flex-col border border-gray-100 dark:border-slate-700 overflow-hidden animate-slide-up" style="animation-delay: 0.1s">
                    <div class="bg-gradient-to-r from-green-500 to-green-600 p-4 flex-shrink-0">
                        <h2 class="text-lg font-bold text-white flex items-center gap-2">
                            <span>🏥</span> 健康 (Health)
                        </h2>
                    </div>
                    <div id="health-list" class="p-4 space-y-4 min-h-[200px] bg-gray-50 dark:bg-slate-800/50 flex-1 overflow-y-auto scrollbar-thin">
                        <p class="text-gray-400 dark:text-gray-500 text-center py-4">No news collected yet.</p>
                    </div>
                </div>

                <!-- Lifestyle Column -->
                <div class="bg-white dark:bg-slate-800 rounded-xl shadow-md flex flex-col border border-gray-100 dark:border-slate-700 overflow-hidden animate-slide-up" style="animation-delay: 0.2s">
                    <div class="bg-gradient-to-r from-orange-500 to-orange-600 p-4 flex-shrink-0">
                        <h2 class="text-lg font-bold text-white flex items-center gap-2">
                            <span>🏡</span> 生活 (Lifestyle)
                        </h2>
                    </div>
                    <div id="lifestyle-list" class="p-4 space-y-4 min-h-[200px] bg-gray-50 dark:bg-slate-800/50 flex-1 overflow-y-auto scrollbar-thin">
                        <p class="text-gray-400 dark:text-gray-500 text-center py-4">No news collected yet.</p>
                    </div>
                </div>

                <!-- Economy Column -->
                <div class="bg-white dark:bg-slate-800 rounded-xl shadow-md flex flex-col border border-gray-100 dark:border-slate-700 overflow-hidden animate-slide-up" style="animation-delay: 0.3s">
                    <div class="bg-gradient-to-r from-blue-500 to-blue-600 p-4 flex-shrink-0">
                        <h2 class="text-lg font-bold text-white flex items-center gap-2">
                            <span>💼</span> 経済 (Economy)
                        </h2>
                    </div>
                    <div id="economy-list" class="p-4 space-y-4 min-h-[200px] bg-gray-50 dark:bg-slate-800/50 flex-1 overflow-y-auto scrollbar-thin">
                        <p class="text-gray-400 dark:text-gray-500 text-center py-4">No news collected yet.</p>
                    </div>
                </div>

                <!-- Society Column -->
                <div class="bg-white dark:bg-slate-800 rounded-xl shadow-md flex flex-col border border-gray-100 dark:border-slate-700 overflow-hidden animate-slide-up" style="animation-delay: 0.4s">
                    <div class="bg-gradient-to-r from-purple-500 to-purple-600 p-4 flex-shrink-0">
                        <h2 class="text-lg font-bold text-white flex items-center gap-2">
                            <span>📢</span> 社会 (Society)
                        </h2>
                    </div>
                    <div id="society-list" class="p-4 space-y-4 min-h-[200px] bg-gray-50 dark:bg-slate-800/50 flex-1 overflow-y-auto scrollbar-thin">
                        <p class="text-gray-400 dark:text-gray-500 text-center py-4">No news collected yet.</p>
                    </div>
                </div>
            </div>
        </div>
    </main>

    <!-- Custom Confirmation Modal -->
    <div id="confirmModal" class="hidden fixed inset-0 z-50 overflow-y-auto" aria-labelledby="modal-title" role="dialog" aria-modal="true">
        <div class="flex items-end justify-center min-h-screen pt-4 px-4 pb-20 text-center sm:block sm:p-0">
            <div class="fixed inset-0 bg-gray-500/75 dark:bg-black/75 transition-opacity backdrop-blur-sm" aria-hidden="true"></div>
            <span class="hidden sm:inline-block sm:align-middle sm:h-screen" aria-hidden="true">​</span>
            <div class="inline-block align-bottom bg-white dark:bg-slate-800 rounded-xl text-left overflow-hidden shadow-xl transform transition-all sm:my-8 sm:align-middle sm:max-w-lg sm:w-full animate-slide-up">
                <div class="bg-white dark:bg-slate-800 px-4 pt-5 pb-4 sm:p-6 sm:pb-4">
                    <div class="sm:flex sm:items-start">
                        <div class="mx-auto flex-shrink-0 flex items-center justify-center h-12 w-12 rounded-full bg-blue-100 dark:bg-blue-900/50 sm:mx-0 sm:h-10 sm:w-10">
                            <svg class="h-6 w-6 text-blue-600 dark:text-blue-400" xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 24 24" stroke="currentColor">
                                <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M13 16h-1v-4h-1m1-4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
                            </svg>
                        </div>
                        <div class="mt-3 text-center sm:mt-0 sm:ml-4 sm:text-left">
                            <h3 class="text-lg leading-6 font-medium text-gray-900 dark:text-white" id="modal-title">
                                Start News Collection?
                            </h3>
                            <div class="mt-2">
                                <p class="text-sm text-gray-500 dark:text-gray-400">
                                    This will fetch the latest news from Yahoo Japan for all 4 categories. It includes
                                    downloading full article content, so it may take <strong class="text-gray-700 dark:text-gray-300">30-60 seconds</strong> to complete.
                                </p>
                            </div>
                        </div>
                    </div>
                </div>
                <div class="bg-gray-50 dark:bg-slate-700/50 px-4 py-3 sm:px-6 sm:flex sm:flex-row-reverse">
                    <button id="modalConfirmBtn" type="button"
                        class="w-full inline-flex justify-center rounded-lg border border-transparent shadow-sm px-4 py-2 bg-blue-600 text-base font-medium text-white hover:bg-blue-700 focus:outline-none focus:ring-2 focus:ring-offset-2 focus:ring-blue-500 sm:ml-3 sm:w-auto sm:text-sm transition-colors">
                        Start Collection
                    </button>
                    <button id="modalCancelBtn" type="button"
                        class="mt-3 w-full inline-flex justify-center rounded-lg border border-gray-300 dark:border-slate-600 shadow-sm px-4 py-2 bg-white dark:bg-slate-700 text-base font-medium text-gray-700 dark:text-gray-300 hover:bg-gray-50 dark:hover:bg-slate-600 focus:outline-none focus:ring-2 focus:ring-offset-2 focus:ring-indigo-500 sm:mt-0 sm:ml-3 sm:w-auto sm:text-sm transition-colors">
                        Cancel
                    </button>
                </div>
            </div>
        </div>
    </div>

    <!-- Article Detail Modal -->
    <div id="articleModal" class="hidden fixed inset-0 z-50 overflow-y-auto" aria-labelledby="article-modal-title" role="dialog" aria-modal="true">
        <div class="flex items-center justify-center min-h-screen pt-4 px-4 pb-20 text-center sm:p-0">
            <div class="fixed inset-0 bg-gray-500/75 dark:bg-black/75 transition-opacity backdrop-blur-sm" aria-hidden="true" id="articleModalOverlay"></div>
            <div class="inline-block align-bottom bg-white dark:bg-slate-800 rounded-xl text-left overflow-hidden shadow-xl transform transition-all sm:my-8 sm:align-middle sm:max-w-3xl sm:w-full max-h-[90vh] animate-slide-up">
                <div class="absolute top-4 right-4 z-10">
                    <button id="closeArticleModal" class="p-2 rounded-full bg-white/80 dark:bg-slate-700/80 hover:bg-white dark:hover:bg-slate-600 shadow-lg transition-colors">
                        <svg class="w-5 h-5 text-gray-600 dark:text-gray-300" fill="none" stroke="currentColor" viewBox="0 0 24 24">
                            <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M6 18L18 6M6 6l12 12"/>
                        </svg>
                    </button>
                </div>

                <!-- Modal Content -->
                <div class="overflow-y-auto max-h-[90vh]">
                    <!-- Article Image -->
                    <div class="relative h-64 bg-gray-200 dark:bg-slate-700">
                        <img id="modalImage" src="" class="w-full h-full object-cover" alt="Article image">
                        <div class="absolute bottom-0 left-0 right-0 bg-gradient-to-t from-black/70 to-transparent p-6">
                            <span id="modalCategory" class="px-3 py-1 rounded-full text-xs font-medium bg-blue-500 text-white"></span>
                        </div>
                    </div>

                    <!-- Article Content -->
                    <div class="p-6">
                        <h2 id="modalTitle" class="text-2xl font-bold text-gray-900 dark:text-white mb-3"></h2>

                        <div class="flex items-center gap-4 text-sm text-gray-500 dark:text-gray-400 mb-4">
                            <span id="modalSource" class="flex items-center gap-1">
                                <svg class="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
                                    <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M19 20H5a2 2 0 01-2-2V6a2 2 0 012-2h10a2 2 0 012 2v1m2 13a2 2 0 01-2-2V7m2 13a2 2 0 002-2V9a2 2 0 00-2-2h-2m-4-3H9M7 16h6M7 8h6v4H7V8z"/>
                                </svg>
                            </span>
                            <span id="modalTime" class="flex items-center gap-1">
                                <svg class="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
                                    <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M12 8v4l3 3m6-3a9 9 0 11-18 0 9 9 0 0118 0z"/>
                                </svg>
                            </span>
                        </div>

                        <div class="prose dark:prose-invert max-w-none">
                            <p id="modalContent" class="text-gray-700 dark:text-gray-300 leading-relaxed whitespace-pre-wrap"></p>
                        </div>

                        <div class="mt-6 pt-4 border-t border-gray-200 dark:border-slate-700">
                            <!-- rel="noopener noreferrer" keeps the opened article page from reaching back via window.opener -->
                            <a id="modalLink" href="#" target="_blank" rel="noopener noreferrer"
                                class="inline-flex items-center gap-2 px-4 py-2 bg-blue-600 hover:bg-blue-700 text-white rounded-lg transition-colors">
                                <span>Read Full Article</span>
                                <svg class="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
                                    <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M10 6H6a2 2 0 00-2 2v10a2 2 0 002 2h10a2 2 0 002-2v-4M14 4h6m0 0v6m0-6L10 14"/>
                                </svg>
                            </a>
                        </div>
                    </div>
                </div>
            </div>
        </div>
    </div>

    <script>
        // Elements
        const collectBtn = document.getElementById('collectBtn');
        const downloadBtn = document.getElementById('downloadBtn');
        const loading = document.getElementById('loading');
        const dateList = document.getElementById('date-list');
        const currentViewLabel = document.getElementById('current-view-label');
        const confirmModal = document.getElementById('confirmModal');
        const modalConfirmBtn = document.getElementById('modalConfirmBtn');
        const modalCancelBtn = document.getElementById('modalCancelBtn');
        const articleModal = document.getElementById('articleModal');
        const articleModalOverlay = document.getElementById('articleModalOverlay');
        const closeArticleModal = document.getElementById('closeArticleModal');
        const darkModeToggle = document.getElementById('darkModeToggle');

        const categories = {
            'Health': document.getElementById('health-list'),
            'Lifestyle': document.getElementById('lifestyle-list'),
            'Economy': document.getElementById('economy-list'),
            'Society': document.getElementById('society-list')
        };

        const categoryColors = {
            'Health': { bg: 'bg-green-100 dark:bg-green-900/30', text: 'text-green-600 dark:text-green-400', badge: 'bg-green-500' },
            'Lifestyle': { bg: 'bg-orange-100 dark:bg-orange-900/30', text: 'text-orange-600 dark:text-orange-400', badge: 'bg-orange-500' },
            'Economy': { bg: 'bg-blue-100 dark:bg-blue-900/30', text: 'text-blue-600 dark:text-blue-400', badge: 'bg-blue-500' },
            'Society': { bg: 'bg-purple-100 dark:bg-purple-900/30', text: 'text-purple-600 dark:text-purple-400', badge: 'bg-purple-500' }
        };

        let selectedDate = null;
        let currentNewsData = {};
        let allArticles = [];

        // Dark Mode
        function initDarkMode() {
            if (localStorage.getItem('darkMode') === 'true' ||
                (!localStorage.getItem('darkMode') && window.matchMedia('(prefers-color-scheme: dark)').matches)) {
                document.documentElement.classList.add('dark');
            }
        }

        darkModeToggle.addEventListener('click', () => {
            document.documentElement.classList.toggle('dark');
            localStorage.setItem('darkMode', document.documentElement.classList.contains('dark'));
        });

        // Escape scraped text before interpolating it into an innerHTML template.
        function escapeHtml(value) {
            const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
            return String(value ?? '').replace(/[&<>"']/g, (ch) => entities[ch]);
        }

        function createArticleCard(article, index) {
            const card = document.createElement('div');
            card.className = 'bg-white dark:bg-slate-700 border border-gray-100 dark:border-slate-600 rounded-lg overflow-hidden card-hover cursor-pointer animate-fade-in';
            card.style.animationDelay = `${index * 0.1}s`;

            const imgSrc = article.image_url || 'https://via.placeholder.com/300x200?text=No+Image';
            // Append an ellipsis only when the content is actually truncated.
            const content = article.content || '';
            const summary = content.length > 80 ? content.substring(0, 80) + '...' : content;
            // collected_at is expected to look like "YYYY-MM-DDTHH:MM:SS"; show HH:MM when present.
            const time = (article.collected_at || '').split('T')[1]?.substring(0, 5) ?? '';

            // onerror clears itself first so a failing fallback image cannot retrigger it in a loop.
            card.innerHTML = `
                <div class="relative overflow-hidden h-32">
                    <img src="${escapeHtml(imgSrc)}" class="w-full h-full object-cover transform hover:scale-110 transition duration-500" alt="news image" onerror="this.onerror=null;this.src='https://via.placeholder.com/300x200?text=News'">
                    <span class="absolute bottom-0 right-0 bg-black/60 text-white text-[10px] px-2 py-1 rounded-tl">${escapeHtml(article.source)}</span>
                </div>
                <div class="p-3">
                    <h3 class="font-bold text-sm mb-2 leading-snug line-clamp-2 text-gray-900 dark:text-white group-hover:text-blue-600">
                        ${escapeHtml(article.title)}
                    </h3>
                    ${summary ? `<p class="text-xs text-gray-500 dark:text-gray-400 line-clamp-2 mb-2">${escapeHtml(summary)}</p>` : ''}
                    <div class="flex justify-between items-center mt-2">
                        <p class="text-xs text-gray-500 dark:text-gray-400">${escapeHtml(time)}</p>
                        <span class="text-xs text-blue-600 dark:text-blue-400 hover:underline">Read more →</span>
                    </div>
                </div>
            `;

            card.onclick = () => openArticleModal(article);
            return card;
        }

        function openArticleModal(article) {
            const colors = categoryColors[article.category] || categoryColors['Economy'];

            document.getElementById('modalImage').src = article.image_url || 'https://via.placeholder.com/800x400?text=No+Image';
            document.getElementById('modalCategory').textContent = article.category;
            document.getElementById('modalCategory').className = `px-3 py-1 rounded-full text-xs font-medium ${colors.badge} text-white`;
            document.getElementById('modalTitle').textContent = article.title;

            // Insert the static icons as HTML, then append scraped values as text nodes
            // so they are never parsed as markup.
            const sourceEl = document.getElementById('modalSource');
            sourceEl.innerHTML = `
                <svg class="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
                    <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M19 20H5a2 2 0 01-2-2V6a2 2 0 012-2h10a2 2 0 012 2v1m2 13a2 2 0 01-2-2V7m2 13a2 2 0 002-2V9a2 2 0 00-2-2h-2m-4-3H9M7 16h6M7 8h6v4H7V8z"/>
                </svg>
            `;
            sourceEl.append(article.source || '');

            const timeEl = document.getElementById('modalTime');
            timeEl.innerHTML = `
                <svg class="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
                    <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M12 8v4l3 3m6-3a9 9 0 11-18 0 9 9 0 0118 0z"/>
                </svg>
            `;
            // Tolerate a missing timestamp instead of throwing.
            timeEl.append((article.collected_at || '').replace('T', ' ').substring(0, 16));

            document.getElementById('modalContent').textContent = article.content || 'No content available.';
            document.getElementById('modalLink').href = article.url;

            articleModal.classList.remove('hidden');
            document.body.style.overflow = 'hidden';
        }

        function closeArticleModalFunc() {
            articleModal.classList.add('hidden');
            document.body.style.overflow = '';
        }

        closeArticleModal.addEventListener('click', closeArticleModalFunc);
        articleModalOverlay.addEventListener('click', closeArticleModalFunc);

        // Statistics
        function updateStats() {
            let total = 0;
            const stats = { Health: 0, Lifestyle: 0, Economy: 0, Society: 0 };

            for (const [category, articles] of Object.entries(currentNewsData)) {
                stats[category] = articles.length;
                total += articles.length;
            }

            document.getElementById('stat-total').textContent = total;
            document.getElementById('stat-health').textContent = stats.Health;
            document.getElementById('stat-lifestyle').textContent = stats.Lifestyle;
            document.getElementById('stat-economy').textContent = stats.Economy;
            document.getElementById('stat-society').textContent = stats.Society;
        }

        // Fetch functions
        async function fetchDates() {
            try {
                const response = await fetch('/api/dates');
                // fetch() only rejects on network failure, so check the HTTP status explicitly.
                if (!response.ok) throw new Error(`HTTP ${response.status}`);
                const data = await response.json();
                renderDates(data.dates);
            } catch (error) {
                console.error('Error fetching dates:', error);
            }
        }

        function renderDates(dates) {
            dateList.innerHTML = '';
            if (!dates || dates.length === 0) {
                dateList.innerHTML = '<p class="text-gray-500 dark:text-gray-400 text-sm px-4">No history.</p>';
                return;
            }

            dates.forEach((date, index) => {
                const btn = document.createElement('button');
                // Highlight the date that is currently selected.
                btn.className = `w-full text-left px-4 py-3 rounded-lg transition-all duration-200 text-sm flex justify-between items-center group animate-slide-in ${
                    selectedDate === date
                        ? 'bg-blue-100 dark:bg-blue-900/50 text-blue-700 dark:text-blue-300 border-l-4 border-blue-500'
                        : 'hover:bg-gray-100 dark:hover:bg-slate-700 text-gray-700 dark:text-gray-300'
                }`;
                btn.style.animationDelay = `${index * 0.05}s`;
                btn.innerHTML = `
                    <span class="flex items-center gap-2">
                        <svg class="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
                            <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M8 7V3m8 4V3m-9 8h10M5 21h14a2 2 0 002-2V7a2 2 0 00-2-2H5a2 2 0 00-2 2v12a2 2 0 002 2z"/>
                        </svg>
                        ${date}
                    </span>
                    <svg class="w-4 h-4 opacity-0 group-hover:opacity-100 transition-opacity" fill="none" stroke="currentColor" viewBox="0 0 24 24">
                        <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M9 5l7 7-7 7"/>
                    </svg>
                `;
                btn.onclick = () => selectDate(date);
                dateList.appendChild(btn);
            });
        }

        async function selectDate(date) {
            selectedDate = date;
            currentViewLabel.textContent = `Viewing: ${date}`;
            // Re-render the date list so the new selection is highlighted.
            await fetchDates();
            await fetchNews(date);
        }

        async function fetchNews(date = null) {
            loading.classList.remove('hidden');
            try {
                let url = '/api/news';
                if (date) {
                    url += `?date=${encodeURIComponent(date)}`;
                }
                const response = await fetch(url);
                // fetch() only rejects on network failure, so check the HTTP status explicitly.
                if (!response.ok) throw new Error(`HTTP ${response.status}`);
                const data = await response.json();
                currentNewsData = data;

                // Flatten articles
                allArticles = [];
                for (const [category, articles] of Object.entries(data)) {
                    articles.forEach(article => {
                        allArticles.push({ ...article, category });
                    });
                }

                renderNews(data);
                updateStats();

                if (!date) fetchDates();

            } catch (error) {
                console.error('Error fetching news:', error);
            } finally {
                loading.classList.add('hidden');
            }
        }

        function renderNews(data) {
            for (const [category, articles] of Object.entries(data)) {
                const container = categories[category];
                // Skip categories that have no column in the card view.
                if (!container) continue;

                container.innerHTML = '';
                if (articles.length === 0) {
                    container.innerHTML = '<div class="flex flex-col items-center justify-center h-40 text-gray-400 dark:text-gray-500"><span class="text-2xl mb-2">📭</span><p>No news available.</p></div>';
                    continue;
                }

                articles.forEach((article, index) => {
                    container.appendChild(createArticleCard({ ...article, category }, index));
                });
            }
        }

        // Modal handlers
        collectBtn.addEventListener('click', () => {
            confirmModal.classList.remove('hidden');
        });

        modalCancelBtn.addEventListener('click', () => {
            confirmModal.classList.add('hidden');
        });

        modalConfirmBtn.addEventListener('click', async () => {
            confirmModal.classList.add('hidden');
            loading.classList.remove('hidden');

            try {
                const response = await fetch('/api/collect-news', { method: 'POST' });
                // Surface HTTP errors instead of silently reloading stale data.
                if (!response.ok) throw new Error(`HTTP ${response.status}`);

                // Build today's date as YYYY-MM-DD in the browser's local time zone.
                const now = new Date();
                const year = now.getFullYear();
                const month = String(now.getMonth() + 1).padStart(2, '0');
                const day = String(now.getDate()).padStart(2, '0');
                const today = `${year}-${month}-${day}`;

                await fetchDates();
                await selectDate(today);

            } catch (error) {
                console.error('Error collecting news:', error);
            } finally {
                loading.classList.add('hidden');
            }
        });

        downloadBtn.addEventListener('click', () => {
            const dateStr = selectedDate ? `?date=${encodeURIComponent(selectedDate)}` : '';
            window.location.href = `/api/download-json${dateStr}`;
        });

        // Keyboard shortcuts
        document.addEventListener('keydown', (e) => {
            if (e.key === 'Escape') {
                closeArticleModalFunc();
                confirmModal.classList.add('hidden');
            }
        });

        // Initialize
        initDarkMode();
        fetchDates().then(() => {
            fetchNews();
        });
    </script>
</body>

</html>