
Role summary
We’re hiring a senior Systems Administrator to own the infrastructure and engineering for an enterprise knowledge platform built on Confluence + Jira, enriched by an AI metadata layer, and exposed through an interactive Q/A forum and messaging integrations (Telegram first). This is a hands-on systems role: you will design, build, secure, operate, and scale the end-to-end system, mentor junior engineers, and propose continuous improvements to reliability, performance, and metadata quality.
What you’ll own
- Full lifecycle delivery of a knowledge platform: Confluence + Jira + custom macros/plugins + AI metadata services + Q/A forum.
- Integration of the knowledge platform with social/messaging apps (Telegram first; expand to others as needed).
- Metadata acquisition, enrichment, and generation pipelines that use machine learning / LLMs for auto-tagging, summarization, and semantic search.
- System reliability: monitoring, alerting, backups, observability, incident response, and capacity planning.
- Documentation, runbooks, and operational playbooks for the platform and for handover to stakeholders.
- Cross-team collaboration with Content Owners, KM Managers, Product, and Security to ensure data governance and compliance.
Key responsibilities
- Architect, implement and operate Confluence and Jira instances for enterprise scale (users, spaces, pages, permissions).
- Develop custom Confluence macros, Jira extensions, and automation rules to improve UX and KM workflows.
- Design and build metadata pipelines:
  - Acquire metadata from content (files, pages, attachments).
  - Enrich metadata with AI (summaries, topics, entity extraction, skills, confidence scores).
  - Generate structured metadata (tags, owners, versioning, lifecycle state).
- Build and maintain a moderated Q/A forum (similar to Stack Overflow/Tech Stack) connected to the Confluence knowledge base:
  - Authentication/SSO, moderation tools, reputation or points logic, search, and linking to source SOPs.
- Integrate systems with Telegram (bots, notifications, query → answer flows), and other messaging platforms as required.
- Implement semantic search and retrieval (vector embeddings, index management, hybrid search).
- Ensure data security and governance: encryption at rest/in transit, access controls, audit trails, PII handling, backups.
- Monitor metrics and propose solutions to improve metadata accuracy, search relevance, and system throughput.
- Capacity planning and scaling (vertical/horizontal, caching, queueing, database scaling, autoscaling policies).
- Propose and implement CI/CD pipelines for code and infrastructure; automate tests, deployments, and canary rollouts.
- Mentor junior engineers and create onboarding materials for Ops and KM teams.
Required skills & experience (must-haves)
- 5+ years in systems administration / site reliability / platform engineering with demonstrable senior responsibilities.
- Deep production experience with the Atlassian stack (Confluence and Jira), including creating custom macros, templates, and automation rules.
- Hands-on experience developing Confluence macros/plugins or Jira apps (server/DC or cloud variants) — ability to write/maintain plugin code and adapt macros to KM needs.
- Practical experience building AI/ML-powered metadata pipelines:
  - Familiarity with LLMs / NLP workflows (prompting, embeddings, summarization, entity extraction).
  - Experience with vector databases or embedding indexes (Faiss, Milvus, Pinecone, Annoy, etc.) or equivalent semantic search systems.
- Backend development skills: Python and/or Node.js/TypeScript (ability to implement bots, microservices, ETL).
- Experience designing and operating Telegram bots and integrations (Telegram Bot API, webhooks); familiarity with other messaging platforms is a plus.
- Strong infra skills: Docker, Kubernetes (or other container orchestration), Terraform/CloudFormation, CI/CD (GitHub Actions/GitLab/Jenkins).
- Databases: PostgreSQL (primary), Elasticsearch / OpenSearch (search), or equivalent.
- Observability & SRE tools: Prometheus, Grafana, logs (ELK/EFK), alerting.
- Security & identity: SSO (SAML/OAuth/OpenID Connect), RBAC, TLS, secrets management (Vault/Cloud Secrets).
- Excellent troubleshooting and incident response skills; can debug complex production issues across app + infra + AI layers.
- Strong written and verbal communication — able to produce operational runbooks, docs, and present technical proposals to leadership.
Nice-to-have
- Prior work building knowledge markets, Q/A forums, or community platforms.
- Experience with Atlassian ScriptRunner, Atlassian Connect, Forge, or similar extension frameworks.
- Familiarity with RAG (Retrieval-Augmented Generation) patterns and prompt engineering best practices.
- Experience with data governance frameworks and compliance (GDPR, local data rules).
- Experience with message queues (RabbitMQ, Kafka) for eventing and workflows.
- ML model lifecycle familiarity (model deployment, monitoring, drift detection).
Behavioral & leadership expectations
- Ownership mindset — you take a problem from concept to production and remain accountable for its health.
- Collaborative — work with KM owners, product managers, and legal/security teams.
- Continuous improvement — propose architecture and process changes backed by clear metrics.
- Mentorship — uplift junior teammates and formalize operational knowledge.
Send your résumé to Argoman Venture Studio (ونچر استودیوی آرگومان).